Project Nature & Content: Computer Vision
Motivation and Background Documentation
The primary intent of this assignment is to give you hands-on, practical experience with understanding the transition from simple (single hidden layer) to deep (multiple hidden layers) networks.
This hinges on understanding how hidden nodes learn to extract features from their inputs. When there are multiple hidden node layers, each successive layer extracts more generalized and abstract features.
When a hidden layer "learns" the kinds of features that are inherent in its input data, it is using a generative method. In this case, we're not telling it what those feature classes are; it has to figure them out on its own.
Pragmatically, we will emulate how a hidden layer learns features by constructing "classes" of input data - placing inputs into classes that we THINK share similar features. Then we conduct experiments to determine what the hidden nodes are actually learning.
You will have gathered and preprocessed your data, designed and refined your network structure, trained and tested the network, varied the hyperparameters to improve performance and analyzed/assessed the results.
The most important thing is not just to give a summary of classification rates/errors; I trust that you can get a working classifier, or can train a network to do a useful task.
You are welcome to use the CIFAR-10 data for this exercise. You may use Python with user-defined functions, Python with TensorFlow, and/or Python with Keras. For example, you can conduct the following experiments on the CIFAR-10 data. The goal is to compare DNN and CNN architectures. In all the experiments, you may hold some parameters constant - for example, a batch size of 100, 20 epochs, the same optimizer, and the same cross-entropy loss function - so that the comparisons are fair.
Experiment 1: DNN with 2 layers (no regularization)
Experiment 2: DNN with 3 layers (no regularization)
Experiment 3: CNN with 2 convolution/max pooling layers (no regularization)
Experiment 4: CNN with 3 convolution/max pooling layers (no regularization)
Experiment 5+: You will conduct several more experiments. (a) Redo all four experiments with some regularization technique. (b) Create more experiments on your own by tweaking architectures and/or hyperparameters.
Result1: Create a table with the accuracy and loss for train/test/validation & process time for ALL the models.
Result2: Take Experiment 3 - extract the outputs of 2 filters from each of the 2 max-pooling layers and visualize them in a grid as images. See whether the 'lit up' regions correspond to features in the original images.
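For Result2, once the pooling-layer outputs have been extracted (an intermediate Keras model, as built later in this notebook, is one way to get them), the selected feature maps can be tiled into a single image grid with plain NumPy. The `tile_feature_maps` helper and the random `maps` array below are hypothetical stand-ins, a minimal sketch rather than the assignment's required solution:

```python
import numpy as np

def tile_feature_maps(feature_maps, n_cols=2):
    """Tile (N, H, W) feature maps into one 2-D grid image.

    feature_maps: array of shape (N, H, W), e.g. two filters taken
    from each max-pooling layer's output for a single input image.
    """
    n, h, w = feature_maps.shape
    n_rows = int(np.ceil(n / n_cols))
    grid = np.zeros((n_rows * h, n_cols * w), dtype=feature_maps.dtype)
    for i, fmap in enumerate(feature_maps):
        r, c = divmod(i, n_cols)
        grid[r * h:(r + 1) * h, c * w:(c + 1) * w] = fmap
    return grid

# Example: 4 fake 16x16 feature maps -> a 2x2 grid (a 32x32 image).
maps = np.random.rand(4, 16, 16)
grid = tile_feature_maps(maps, n_cols=2)
print(grid.shape)  # (32, 32)
```

The resulting grid can then be shown with `plt.imshow(grid, cmap='viridis')` next to the original image to compare the lit-up regions.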
Import packages needed¶
import numpy as np
import time
import pandas as pd
from packaging import version
from collections import Counter
import random
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.metrics import accuracy_score
from sklearn.metrics import mean_squared_error as MSE
from sklearn.model_selection import train_test_split
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import models, layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, BatchNormalization, Dropout, Flatten, Dense
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
from tensorflow.keras.preprocessing import image
from tensorflow.keras.utils import to_categorical
import tensorflow.keras.backend as k
# from tensorflow.keras.optimizers.legacy import Adam
from tensorflow.python.client import device_lib
import warnings
warnings.filterwarnings('ignore')
from IPython.display import display, HTML
display(HTML("<style>.container { width:80% !important; }</style>"))
Verify TensorFlow Version and Keras Version¶
print("This notebook requires TensorFlow 2.0 or above")
print("TensorFlow version: ", tf.__version__)
assert version.parse(tf.__version__).release[0] >=2
This notebook requires TensorFlow 2.0 or above
TensorFlow version:  2.15.0
seed_val = 43
# The below is necessary for starting Numpy generated random numbers
# in a well-defined initial state.
np.random.seed(seed_val)
# The below is necessary for starting core Python generated random numbers
# in a well-defined state.
random.seed(seed_val)
# The below set_seed() will make random number generation
# in the TensorFlow backend have a well-defined initial state.
# For further details, see:
# https://www.tensorflow.org/api_docs/python/tf/random/set_seed
tf.random.set_seed(seed_val)
EDA Functions¶
def get_three_classes(x, y):
def indices_of(class_id):
indices, _ = np.where(y == float(class_id))
return indices
indices = np.concatenate([indices_of(0), indices_of(1), indices_of(2)], axis=0)
x = x[indices]
y = y[indices]
count = x.shape[0]
indices = np.random.choice(range(count), count, replace=False)
x = x[indices]
y = y[indices]
y = tf.keras.utils.to_categorical(y)
return x, y
def show_random_examples(x, y, p):
indices = np.random.choice(range(x.shape[0]), 10, replace=False)
x = x[indices]
y = y[indices]
p = p[indices]
plt.figure(figsize=(10, 5))
for i in range(10):
plt.subplot(2, 5, i + 1)
plt.imshow(x[i])
plt.xticks([])
plt.yticks([])
col = 'green' if np.argmax(y[i]) == np.argmax(p[i]) else 'red'
plt.xlabel(class_names_preview[np.argmax(p[i])], color=col)
plt.show()
Research Assignment Reporting Functions¶
def plot_history(history):
losses = history.history['loss']
accs = history.history['accuracy']
val_losses = history.history['val_loss']
val_accs = history.history['val_accuracy']
epochs = len(losses)
plt.figure(figsize=(16, 4))
for i, metrics in enumerate(zip([losses, accs], [val_losses, val_accs], ['Loss', 'Accuracy'])):
plt.subplot(1, 2, i + 1)
plt.plot(range(epochs), metrics[0], label='Training {}'.format(metrics[2]))
plt.plot(range(epochs), metrics[1], label='Validation {}'.format(metrics[2]))
plt.legend()
plt.show()
def display_training_curves(training, validation, title, subplot):
ax = plt.subplot(subplot)
ax.plot(training)
ax.plot(validation)
ax.set_title('model '+ title)
ax.set_ylabel(title)
ax.set_xlabel('epoch')
ax.legend(['training', 'validation'])
def print_validation_report(y_test, predictions):
print("Classification Report")
print(classification_report(y_test, predictions))
print('Accuracy Score: {}'.format(accuracy_score(y_test, predictions)))
print('Root Mean Square Error: {}'.format(np.sqrt(MSE(y_test, predictions))))
def plot_confusion_matrix(y_true, y_pred):
mtx = confusion_matrix(y_true, y_pred)
fig, ax = plt.subplots(figsize=(16,12))
sns.heatmap(mtx, annot=True, fmt='d', linewidths=.75, cbar=False, ax=ax,cmap='Blues',linecolor='white')
# square=True,
plt.ylabel('true label')
plt.xlabel('predicted label')
Loading cifar10 Dataset¶
The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly-selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5000 images from each class.
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 [==============================] - 2s 0us/step
EDA¶
print('train_images:\t{}'.format(x_train.shape))
print('train_labels:\t{}'.format(y_train.shape))
print('test_images:\t{}'.format(x_test.shape))
print('test_labels:\t{}'.format(y_test.shape))
train_images:	(50000, 32, 32, 3)
train_labels:	(50000, 1)
test_images:	(10000, 32, 32, 3)
test_labels:	(10000, 1)
print("First ten labels training dataset:\n {}\n".format(y_train[0:10]))
print("This outputs the numeric labels; we need to convert them to item descriptions")
First ten labels training dataset:
 [[6] [9] [9] [4] [1] [1] [2] [7] [8] [3]]
This outputs the numeric labels; we need to convert them to item descriptions
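Converting the numeric labels to their descriptions is a simple list lookup. A minimal sketch using the standard CIFAR-10 class order (the same `class_names` list defined further below), with the first ten label values hard-coded from the output above:

```python
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# y_train rows are 1-element arrays like [6]; in the notebook you would
# use y_train[:10].ravel() instead of this hard-coded list.
first_ten = [6, 9, 9, 4, 1, 1, 2, 7, 8, 3]
named = [class_names[label] for label in first_ten]
print(named)
# ['frog', 'truck', 'truck', 'deer', 'automobile', 'automobile',
#  'bird', 'horse', 'ship', 'cat']
```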
(train_images, train_labels),(test_images, test_labels)= tf.keras.datasets.cifar10.load_data()
x_preview, y_preview = get_three_classes(train_images, train_labels)
x_preview, y_preview = get_three_classes(test_images, test_labels)
class_names_preview = ['aeroplane', 'car', 'bird']
show_random_examples(x_preview, y_preview, y_preview)
plt.figure(figsize = (12 ,8))
items = [{'Class': x, 'Count': y} for x, y in Counter(train_labels.ravel()).items()]
distribution = pd.DataFrame(items).sort_values(['Class'])
sns.barplot(x=distribution.Class, y=distribution.Count);
class_names = ['airplane'
,'automobile'
,'bird'
,'cat'
,'deer'
,'dog'
,'frog'
,'horse'
,'ship'
,'truck']
Create Validation Data Set¶
x_train_split, x_valid_split, y_train_split, y_valid_split = train_test_split(x_train
,y_train
,test_size=.1
,random_state=seed_val
,shuffle=True)
Confirm Datasets {Train, Validation, Test}¶
print("Training\t", x_train_split.shape,
"\nValidation\t", x_valid_split.shape,
"\nTest\t\t", x_test.shape)
Training	 (45000, 32, 32, 3)
Validation	 (5000, 32, 32, 3)
Test		 (10000, 32, 32, 3)
Rescale Examples {Train, Validation, Test}¶
The images are 32x32x3 NumPy arrays, with pixel values ranging from 0 to 255
- Each element in each example is a pixel intensity for one color channel
- Pixel values range from 0 to 255
- 0 = minimum intensity (black)
- 255 = maximum intensity
x_train_norm = x_train_split/255
x_valid_norm = x_valid_split/255
x_test_norm = x_test/255
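Dividing by 255 maps the uint8 pixel values into [0, 1] as floats. A quick sanity check on a synthetic stand-in array (same shape convention as the real data, but not the real data) can be sketched as:

```python
import numpy as np

# Synthetic stand-in for a batch of uint8 images (2 images, 32x32x3).
x = np.array([0, 128, 255], dtype=np.uint8)
x = np.tile(x, 2 * 32 * 32).reshape(2, 32, 32, 3)

# In Python 3, '/' promotes to float64, so no integer-division pitfall.
x_norm = x / 255

print(x_norm.min(), x_norm.max())  # 0.0 1.0
```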
Create the Model¶
results = {}
Experiment 1:¶
- DNN with 2 layers
- no regularization
Build DNN Model¶
k.clear_session()
model_01 = Sequential([
Flatten(input_shape=x_train_norm.shape[1:]),
Dense(units=384,activation=tf.nn.relu),
Dense(units=768,activation=tf.nn.relu),
Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment1"] = {}
results["Experiment1"]["Architecture"] = "• DNN with 2 layers\n • no regularization"
2024-10-20 03:42:39.377699: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1929] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 20974 MB memory: -> device: 0, name: NVIDIA L4, pci bus id: 0000:35:00.0, compute capability: 8.9
model_01.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten (Flatten) (None, 3072) 0
dense (Dense) (None, 384) 1180032
dense_1 (Dense) (None, 768) 295680
dense_2 (Dense) (None, 10) 7690
=================================================================
Total params: 1483402 (5.66 MB)
Trainable params: 1483402 (5.66 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
keras.utils.plot_model(model_01, "CIFAR10_EXP_01.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
model_01.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
metrics=['accuracy'])
Model Train¶
# Start time
start_time = time.time()
history_01 = model_01.fit(x_train_norm
,y_train_split
,epochs=200
,batch_size=64
,verbose=1
,validation_data=(x_valid_norm, y_valid_split)
,callbacks=[
tf.keras.callbacks.ModelCheckpoint("A2_Exp_01_2DNN.h5",save_best_only=True,save_weights_only=False)
,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=7),
]
)
# End time
end_time = time.time()
# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment1"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
2024-10-20 03:42:41.986097: I external/local_xla/xla/service/service.cc:168] XLA service 0x7fdc641c7bb0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-10-20 03:42:41.986130: I external/local_xla/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA L4, Compute Capability 8.9
2024-10-20 03:42:42.004663: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-10-20 03:42:42.038574: I external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:454] Loaded cuDNN version 8902
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1729395762.154273 4048 device_compiler.h:186] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
704/704 [==============================] - 3s 2ms/step - loss: 1.8620 - accuracy: 0.3257 - val_loss: 1.7651 - val_accuracy: 0.3610
Epoch 2/200
704/704 [==============================] - 1s 2ms/step - loss: 1.6743 - accuracy: 0.3978 - val_loss: 1.6542 - val_accuracy: 0.4146
Epoch 3/200
704/704 [==============================] - 1s 2ms/step - loss: 1.5933 - accuracy: 0.4304 - val_loss: 1.6508 - val_accuracy: 0.4144
Epoch 4/200
704/704 [==============================] - 1s 2ms/step - loss: 1.5277 - accuracy: 0.4515 - val_loss: 1.5871 - val_accuracy: 0.4398
Epoch 5/200
704/704 [==============================] - 1s 2ms/step - loss: 1.4943 - accuracy: 0.4641 - val_loss: 1.5559 - val_accuracy: 0.4500
Epoch 6/200
704/704 [==============================] - 1s 2ms/step - loss: 1.4663 - accuracy: 0.4741 - val_loss: 1.5530 - val_accuracy: 0.4480
Epoch 7/200
704/704 [==============================] - 1s 2ms/step - loss: 1.4398 - accuracy: 0.4827 - val_loss: 1.5217 - val_accuracy: 0.4706
Epoch 8/200
704/704 [==============================] - 1s 2ms/step - loss: 1.4171 - accuracy: 0.4948 - val_loss: 1.5630 - val_accuracy: 0.4478
Epoch 9/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3965 - accuracy: 0.4995 - val_loss: 1.5715 - val_accuracy: 0.4530
Epoch 10/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3756 - accuracy: 0.5081 - val_loss: 1.5471 - val_accuracy: 0.4564
Epoch 11/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3568 - accuracy: 0.5138 - val_loss: 1.5371 - val_accuracy: 0.4624
Epoch 12/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3382 - accuracy: 0.5185 - val_loss: 1.5280 - val_accuracy: 0.4652
Epoch 13/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3255 - accuracy: 0.5238 - val_loss: 1.5387 - val_accuracy: 0.4616
Epoch 14/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3111 - accuracy: 0.5298 - val_loss: 1.5503 - val_accuracy: 0.4612
Time taken to train Model: 21.78 seconds
train_loss = history_01.history['loss'][-1] # Training loss from the last epoch
train_accuracy = history_01.history['accuracy'][-1] # Training accuracy from the last epoch
val_loss = history_01.history['val_loss'][-1] # Validation loss from the last epoch
val_accuracy = history_01.history['val_accuracy'][-1] # Validation accuracy from the last epoch
# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")
model_01 = tf.keras.models.load_model("A2_Exp_01_2DNN.h5")
test_loss, test_accuracy = model_01.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")
results["Experiment1"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment1"]["Test Loss"] = round(test_loss,3)
results["Experiment1"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment1"]["Train Loss"] = round(train_loss,3)
results["Experiment1"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment1"]["Validation Loss"] = round(val_loss,3)
Training Loss: 1.311, Training Accuracy: 0.530
Validation Loss: 1.550, Validation Accuracy: 0.461
Test Loss: 1.478, Test Accuracy: 0.474
pred01 = model_01.predict(x_test_norm)
print('shape of preds: ', pred01.shape)
history_01_dict = history_01.history
history_01_dict.keys()
313/313 [==============================] - 1s 911us/step
shape of preds:  (10000, 10)
dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])
history__01_df=pd.DataFrame(history_01_dict)
history__01_df.tail().round(3)
| | loss | accuracy | val_loss | val_accuracy |
|---|---|---|---|---|
| 9 | 1.376 | 0.508 | 1.547 | 0.456 |
| 10 | 1.357 | 0.514 | 1.537 | 0.462 |
| 11 | 1.338 | 0.518 | 1.528 | 0.465 |
| 12 | 1.325 | 0.524 | 1.539 | 0.462 |
| 13 | 1.311 | 0.530 | 1.550 | 0.461 |
Plotting Performance Metrics¶
We use Matplotlib to create two side-by-side plots displaying the training and validation loss and accuracy for each training epoch.
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_01.history['accuracy'], history_01.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_01.history['loss'], history_01.history['val_loss'], 'loss', 212)
Confusion matrices¶
We generate a classification report and confusion matrix using sklearn.metrics, then visualize the confusion matrix and see what it tells us.
pred01_cm=np.argmax(pred01, axis=1)
print_validation_report(y_test, pred01_cm)
Classification Report
precision recall f1-score support
0 0.56 0.47 0.51 1000
1 0.64 0.51 0.57 1000
2 0.38 0.27 0.32 1000
3 0.34 0.30 0.32 1000
4 0.42 0.34 0.38 1000
5 0.40 0.33 0.36 1000
6 0.44 0.65 0.53 1000
7 0.49 0.57 0.53 1000
8 0.52 0.73 0.61 1000
9 0.52 0.57 0.54 1000
accuracy 0.47 10000
macro avg 0.47 0.47 0.47 10000
weighted avg 0.47 0.47 0.47 10000
Accuracy Score: 0.4743
Root Mean Square Error: 3.190235101054466
plot_confusion_matrix(y_test,pred01_cm)
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred01[0:20], columns = ['airplane'
,'automobile'
,'bird'
,'cat'
,'deer'
,'dog'
,'frog'
,'horse'
,'ship'
,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
| | airplane | automobile | bird | cat | deer | dog | frog | horse | ship | truck |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2.77% | 27.15% | 4.39% | 34.81% | 3.44% | 12.96% | 1.63% | 1.93% | 7.93% | 2.99% |
| 1 | 12.46% | 10.69% | 0.89% | 0.03% | 0.16% | 0.03% | 0.06% | 0.17% | 55.51% | 20.00% |
| 2 | 22.51% | 4.83% | 0.11% | 0.07% | 0.06% | 0.06% | 0.01% | 0.32% | 63.13% | 8.92% |
| 3 | 47.36% | 5.11% | 1.19% | 0.26% | 1.15% | 0.60% | 0.04% | 4.09% | 37.21% | 2.98% |
| 4 | 0.14% | 0.32% | 12.79% | 2.22% | 48.58% | 1.60% | 33.94% | 0.22% | 0.15% | 0.04% |
| 5 | 0.64% | 1.33% | 10.41% | 27.64% | 3.89% | 4.59% | 45.16% | 5.36% | 0.17% | 0.83% |
| 6 | 29.74% | 30.35% | 2.03% | 19.15% | 0.52% | 11.03% | 1.32% | 4.34% | 1.28% | 0.25% |
| 7 | 0.53% | 0.44% | 33.00% | 7.02% | 10.17% | 2.22% | 45.86% | 0.30% | 0.09% | 0.36% |
| 8 | 8.12% | 1.57% | 28.15% | 7.73% | 27.35% | 8.56% | 3.97% | 13.19% | 0.99% | 0.37% |
| 9 | 0.93% | 66.77% | 1.43% | 1.65% | 0.35% | 0.55% | 0.01% | 0.20% | 5.73% | 22.37% |
| 10 | 24.96% | 0.38% | 5.37% | 7.07% | 4.28% | 11.15% | 4.48% | 0.40% | 41.66% | 0.25% |
| 11 | 0.12% | 24.82% | 0.14% | 0.65% | 0.07% | 0.25% | 0.14% | 0.11% | 13.47% | 60.23% |
| 12 | 4.03% | 11.27% | 13.19% | 14.34% | 15.02% | 12.00% | 24.42% | 3.10% | 1.46% | 1.18% |
| 13 | 17.55% | 0.61% | 1.98% | 0.12% | 0.24% | 1.83% | 0.12% | 76.64% | 0.42% | 0.49% |
| 14 | 4.44% | 53.18% | 5.58% | 0.79% | 0.18% | 1.24% | 0.08% | 1.86% | 0.67% | 31.98% |
| 15 | 12.92% | 1.14% | 1.75% | 3.11% | 1.72% | 13.57% | 0.94% | 0.92% | 60.53% | 3.40% |
| 16 | 0.19% | 7.07% | 1.14% | 44.20% | 0.08% | 17.42% | 0.17% | 20.49% | 0.92% | 8.32% |
| 17 | 23.75% | 1.56% | 10.05% | 4.80% | 25.29% | 3.16% | 3.69% | 9.88% | 10.61% | 7.21% |
| 18 | 3.22% | 2.97% | 0.08% | 0.03% | 0.25% | 0.01% | 0.01% | 0.18% | 92.46% | 0.79% |
| 19 | 0.24% | 2.92% | 6.11% | 3.29% | 1.78% | 7.92% | 65.50% | 9.72% | 0.09% | 2.43% |
# Extract the outputs of all layers (the slice [:8] covers every layer of this model):
layer_outputs = [layer.output for layer in model_01.layers[:8]]
# Creates a model that will return these outputs, given the model input:
activation_model_01 = tf.keras.models.Model(inputs=model_01.input, outputs=layer_outputs)
# Get activation values for a hidden dense layer and the output layer
activations_01 = activation_model_01.predict(x_valid_norm[:2000])
dense_layer_activations_01 = activations_01[-3]  # hidden dense layer (384 units)
output_layer_activations_01 = activations_01[-1]  # softmax output layer
63/63 [==============================] - 0s 873us/step
sklearn.manifold.TSNE¶
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_01 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_01 = tsne_01.fit_transform(dense_layer_activations_01)
# Scaling
tsne_results_01 = (tsne_results_01 - tsne_results_01.min()) / (tsne_results_01.max() - tsne_results_01.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 2000 samples in 0.000s...
[t-SNE] Computed neighbors for 2000 samples in 0.109s...
[t-SNE] Computed conditional probabilities for sample 1000 / 2000
[t-SNE] Computed conditional probabilities for sample 2000 / 2000
[t-SNE] Mean sigma: 4.205060
[t-SNE] KL divergence after 250 iterations with early exaggeration: 73.325455
[t-SNE] KL divergence after 300 iterations: 2.298152
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_01[:,0],tsne_results_01[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_01[:,0],tsne_results_01[:,1], c=y_valid_split[:2000], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)
image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_01):
dist = np.sum((position - image_positions) ** 2, axis=1)
if np.min(dist) > 0.02: # if far enough from other images
image_positions = np.r_[image_positions, [position]]
imagebox = mpl.offsetbox.AnnotationBbox(
mpl.offsetbox.OffsetImage(x_valid_split[index], cmap="binary"),  # t-SNE points come from the validation set
position, bboxprops={"lw": 1})
plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
Experiment 2¶
- DNN with 3 layers
- no regularization
Build DNN Model¶
k.clear_session()
model_02 = Sequential([
Flatten(input_shape=x_train_norm.shape[1:]),
Dense(units=384,activation=tf.nn.relu),
Dense(units=768,activation=tf.nn.relu),
Dense(units=1536,activation=tf.nn.relu),
Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment2"] = {}
results["Experiment2"]["Architecture"] = "• DNN with 3 layers\n • no regularization"
model_02.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten (Flatten) (None, 3072) 0
dense (Dense) (None, 384) 1180032
dense_1 (Dense) (None, 768) 295680
dense_2 (Dense) (None, 1536) 1181184
dense_3 (Dense) (None, 10) 15370
=================================================================
Total params: 2672266 (10.19 MB)
Trainable params: 2672266 (10.19 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
keras.utils.plot_model(model_02, "CIFAR10_EXP_02.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
model_02.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
metrics=['accuracy'])
Model Train¶
# Start time
start_time = time.time()
history_02 = model_02.fit(x_train_norm
,y_train_split
,epochs=200
,batch_size=64
,verbose=1
,validation_data=(x_valid_norm, y_valid_split)
,callbacks=[
tf.keras.callbacks.ModelCheckpoint("A2_Exp_02_3DNN.h5",save_best_only=True,save_weights_only=False)
,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10),
]
)
# End time
end_time = time.time()
# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment2"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
704/704 [==============================] - 3s 2ms/step - loss: 1.8682 - accuracy: 0.3207 - val_loss: 1.8131 - val_accuracy: 0.3428
Epoch 2/200
704/704 [==============================] - 2s 2ms/step - loss: 1.6875 - accuracy: 0.3922 - val_loss: 1.6631 - val_accuracy: 0.4002
Epoch 3/200
704/704 [==============================] - 2s 2ms/step - loss: 1.6053 - accuracy: 0.4210 - val_loss: 1.6479 - val_accuracy: 0.4274
Epoch 4/200
704/704 [==============================] - 2s 2ms/step - loss: 1.5433 - accuracy: 0.4454 - val_loss: 1.6016 - val_accuracy: 0.4276
Epoch 5/200
704/704 [==============================] - 2s 2ms/step - loss: 1.5096 - accuracy: 0.4591 - val_loss: 1.5565 - val_accuracy: 0.4464
Epoch 6/200
704/704 [==============================] - 2s 2ms/step - loss: 1.4642 - accuracy: 0.4721 - val_loss: 1.5056 - val_accuracy: 0.4610
Epoch 7/200
704/704 [==============================] - 1s 2ms/step - loss: 1.4304 - accuracy: 0.4871 - val_loss: 1.5461 - val_accuracy: 0.4504
Epoch 8/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3962 - accuracy: 0.4967 - val_loss: 1.5182 - val_accuracy: 0.4662
Epoch 9/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3642 - accuracy: 0.5113 - val_loss: 1.5384 - val_accuracy: 0.4670
Epoch 10/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3334 - accuracy: 0.5178 - val_loss: 1.5186 - val_accuracy: 0.4674
Epoch 11/200
704/704 [==============================] - 1s 2ms/step - loss: 1.2867 - accuracy: 0.5344 - val_loss: 1.5450 - val_accuracy: 0.4644
Epoch 12/200
704/704 [==============================] - 1s 2ms/step - loss: 1.2536 - accuracy: 0.5486 - val_loss: 1.5577 - val_accuracy: 0.4644
Epoch 13/200
704/704 [==============================] - 1s 2ms/step - loss: 1.2116 - accuracy: 0.5608 - val_loss: 1.5392 - val_accuracy: 0.4816
Epoch 14/200
704/704 [==============================] - 1s 2ms/step - loss: 1.1612 - accuracy: 0.5789 - val_loss: 1.6585 - val_accuracy: 0.4562
Epoch 15/200
704/704 [==============================] - 1s 2ms/step - loss: 1.1117 - accuracy: 0.5966 - val_loss: 1.6516 - val_accuracy: 0.4562
Epoch 16/200
704/704 [==============================] - 1s 2ms/step - loss: 1.0550 - accuracy: 0.6170 - val_loss: 1.7009 - val_accuracy: 0.4562
Epoch 17/200
704/704 [==============================] - 1s 2ms/step - loss: 1.0008 - accuracy: 0.6342 - val_loss: 1.7092 - val_accuracy: 0.4636
Epoch 18/200
704/704 [==============================] - 1s 2ms/step - loss: 0.9376 - accuracy: 0.6580 - val_loss: 1.7983 - val_accuracy: 0.4670
Epoch 19/200
704/704 [==============================] - 1s 2ms/step - loss: 0.8947 - accuracy: 0.6726 - val_loss: 1.8620 - val_accuracy: 0.4704
Epoch 20/200
704/704 [==============================] - 1s 2ms/step - loss: 0.8237 - accuracy: 0.6989 - val_loss: 1.9702 - val_accuracy: 0.4608
Epoch 21/200
704/704 [==============================] - 1s 2ms/step - loss: 0.7661 - accuracy: 0.7220 - val_loss: 2.0270 - val_accuracy: 0.4550
Epoch 22/200
704/704 [==============================] - 1s 2ms/step - loss: 0.7078 - accuracy: 0.7425 - val_loss: 2.1814 - val_accuracy: 0.4558
Epoch 23/200
704/704 [==============================] - 1s 2ms/step - loss: 0.6547 - accuracy: 0.7624 - val_loss: 2.3066 - val_accuracy: 0.4538
Time taken to train Model: 36.06 seconds
train_loss = history_02.history['loss'][-1] # Training loss from the last epoch
train_accuracy = history_02.history['accuracy'][-1] # Training accuracy from the last epoch
val_loss = history_02.history['val_loss'][-1] # Validation loss from the last epoch
val_accuracy = history_02.history['val_accuracy'][-1] # Validation accuracy from the last epoch
# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")
model_02 = tf.keras.models.load_model("A2_Exp_02_3DNN.h5")
test_loss, test_accuracy = model_02.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")
results["Experiment2"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment2"]["Test Loss"] = round(test_loss,3)
results["Experiment2"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment2"]["Train Loss"] = round(train_loss,3)
results["Experiment2"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment2"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.655, Training Accuracy: 0.762
Validation Loss: 2.307, Validation Accuracy: 0.454
Test Loss: 1.466, Test Accuracy: 0.474
pred02 = model_02.predict(x_test_norm)
print('shape of preds: ', pred02.shape)
history_02_dict = history_02.history
history_02_df=pd.DataFrame(history_02_dict)
history_02_df.tail().round(3)
313/313 [==============================] - 0s 896us/step
shape of preds:  (10000, 10)
| | loss | accuracy | val_loss | val_accuracy |
|---|---|---|---|---|
| 18 | 0.895 | 0.673 | 1.862 | 0.470 |
| 19 | 0.824 | 0.699 | 1.970 | 0.461 |
| 20 | 0.766 | 0.722 | 2.027 | 0.455 |
| 21 | 0.708 | 0.743 | 2.181 | 0.456 |
| 22 | 0.655 | 0.762 | 2.307 | 0.454 |
Plotting Performance Metrics¶
We use Matplotlib to create two side-by-side plots displaying the training and validation loss and accuracy for each training epoch.
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_02.history['accuracy'], history_02.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_02.history['loss'], history_02.history['val_loss'], 'loss', 212)
Confusion matrices¶
We generate a classification report and confusion matrix using sklearn.metrics, then visualize the confusion matrix and see what it tells us.
pred02_cm=np.argmax(pred02, axis=1)
print_validation_report(y_test, pred02_cm)
Classification Report
precision recall f1-score support
0 0.56 0.51 0.53 1000
1 0.64 0.52 0.57 1000
2 0.39 0.17 0.24 1000
3 0.34 0.38 0.36 1000
4 0.37 0.44 0.40 1000
5 0.43 0.28 0.34 1000
6 0.46 0.62 0.53 1000
7 0.51 0.56 0.53 1000
8 0.52 0.69 0.60 1000
9 0.51 0.57 0.54 1000
accuracy 0.47 10000
macro avg 0.47 0.47 0.46 10000
weighted avg 0.47 0.47 0.46 10000
Accuracy Score: 0.4742
Root Mean Square Error: 3.16234090508914
plot_confusion_matrix(y_test,pred02_cm)
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred02[0:20], columns = ['airplane'
,'automobile'
,'bird'
,'cat'
,'deer'
,'dog'
,'frog'
,'horse'
,'ship'
,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
| | airplane | automobile | bird | cat | deer | dog | frog | horse | ship | truck |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3.55% | 0.59% | 9.31% | 45.32% | 4.36% | 23.38% | 8.19% | 0.96% | 3.53% | 0.80% |
| 1 | 6.87% | 14.83% | 0.11% | 0.11% | 0.06% | 0.03% | 0.01% | 0.16% | 25.24% | 52.59% |
| 2 | 19.74% | 13.84% | 0.10% | 0.05% | 0.07% | 0.02% | 0.00% | 0.18% | 47.81% | 18.18% |
| 3 | 20.62% | 5.98% | 1.12% | 0.57% | 1.31% | 0.25% | 0.04% | 1.52% | 59.66% | 8.93% |
| 4 | 0.22% | 0.08% | 5.03% | 3.49% | 47.37% | 2.18% | 40.56% | 0.78% | 0.20% | 0.09% |
| 5 | 2.10% | 1.30% | 7.12% | 22.54% | 18.84% | 7.46% | 36.99% | 1.84% | 0.62% | 1.20% |
| 6 | 5.30% | 3.61% | 4.39% | 59.53% | 0.10% | 19.62% | 4.91% | 1.38% | 0.78% | 0.39% |
| 7 | 1.00% | 1.29% | 20.43% | 18.71% | 15.40% | 13.58% | 25.20% | 1.65% | 0.97% | 1.78% |
| 8 | 4.20% | 0.66% | 18.22% | 13.27% | 33.68% | 12.31% | 1.87% | 9.95% | 4.71% | 1.12% |
| 9 | 2.89% | 59.93% | 1.21% | 3.97% | 0.32% | 1.14% | 0.61% | 0.16% | 8.96% | 20.81% |
| 10 | 20.15% | 0.14% | 4.32% | 11.88% | 0.82% | 16.82% | 4.85% | 1.26% | 39.25% | 0.51% |
| 11 | 0.51% | 13.50% | 0.07% | 0.17% | 0.02% | 0.05% | 0.02% | 0.04% | 1.17% | 84.46% |
| 12 | 2.40% | 4.37% | 7.44% | 26.80% | 7.52% | 22.26% | 12.52% | 11.25% | 2.59% | 2.83% |
| 13 | 13.77% | 8.77% | 1.72% | 0.64% | 1.11% | 5.85% | 0.54% | 63.98% | 2.24% | 1.39% |
| 14 | 4.14% | 41.69% | 4.13% | 1.01% | 0.08% | 2.09% | 0.18% | 1.43% | 1.63% | 43.60% |
| 15 | 7.46% | 1.46% | 3.54% | 6.94% | 4.71% | 8.41% | 5.57% | 2.25% | 55.42% | 4.24% |
| 16 | 0.54% | 0.10% | 4.07% | 38.52% | 2.34% | 42.45% | 3.16% | 6.47% | 1.70% | 0.65% |
| 17 | 10.43% | 8.59% | 5.40% | 11.43% | 20.51% | 8.51% | 6.56% | 10.06% | 9.17% | 9.35% |
| 18 | 3.20% | 1.07% | 0.01% | 0.01% | 0.06% | 0.00% | 0.00% | 0.01% | 94.48% | 1.15% |
| 19 | 0.11% | 0.08% | 3.00% | 3.09% | 20.33% | 3.37% | 62.85% | 6.95% | 0.02% | 0.20% |
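Each row of the table above is a softmax output, so the ten class probabilities in a row sum to 100%. A quick numeric check of that property (toy logits, not model values):

```python
import numpy as np

def softmax(logits):
    # subtract the max for numerical stability before exponentiating
    z = np.exp(logits - logits.max())
    return z / z.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs.sum())  # 1.0 up to floating-point error
```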
# Extracts the outputs of the top 8 layers:
layer_outputs = [layer.output for layer in model_02.layers[:8]]
# Creates a model that will return these outputs, given the model input:
activation_model_02 = tf.keras.models.Model(inputs=model_02.input, outputs=layer_outputs)
# Get activation values for the last dense layer
# activations_02 = activation_model_02.predict(x_valid_norm[:3250])
activations_02 = activation_model_02.predict(x_valid_norm[:2000])
dense_layer_activations_02 = activations_02[-3]
output_layer_activations_02 = activations_02[-1]
63/63 [==============================] - 0s 917us/step
sklearn.manifold.TSNE¶
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_02 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_02 = tsne_02.fit_transform(dense_layer_activations_02)
# Scaling
tsne_results_02 = (tsne_results_02 - tsne_results_02.min()) / (tsne_results_02.max() - tsne_results_02.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 2000 samples in 0.001s...
[t-SNE] Computed neighbors for 2000 samples in 0.062s...
[t-SNE] Computed conditional probabilities for sample 1000 / 2000
[t-SNE] Computed conditional probabilities for sample 2000 / 2000
[t-SNE] Mean sigma: 1.543442
[t-SNE] KL divergence after 250 iterations with early exaggeration: 71.850433
[t-SNE] KL divergence after 300 iterations: 2.353606
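The scaling step above is a global min-max rescale that maps the 2-D embedding into [0, 1], so the coordinates line up with the image-overlay positions used below. A toy check of the same formula:

```python
import numpy as np

# Global min-max scaling, exactly as applied to the t-SNE results above
x = np.array([[-3.0, 5.0], [1.0, 2.0]])
x_scaled = (x - x.min()) / (x.max() - x.min())
print(x_scaled.min(), x_scaled.max())  # 0.0 1.0
```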
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_02[:,0],tsne_results_02[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_02[:,0],tsne_results_02[:,1], c=y_valid_split[:2000], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)
image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_02):
dist = np.sum((position - image_positions) ** 2, axis=1)
if np.min(dist) > 0.02: # if far enough from other images
image_positions = np.r_[image_positions, [position]]
imagebox = mpl.offsetbox.AnnotationBbox(
mpl.offsetbox.OffsetImage(x_train[index], cmap="binary"),
position, bboxprops={"lw": 1})
plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
Experiment 3¶
- CNN with 2 layers/max pooling layers
- 1 fully-connected layer
- no regularization
Build CNN Model¶
k.clear_session()
model_03 = Sequential([
Conv2D(filters=128, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu,input_shape=x_train_norm.shape[1:]),
MaxPool2D((2, 2),strides=2),
Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
MaxPool2D((2, 2),strides=2),
Flatten(),
Dense(units=384,activation=tf.nn.relu),
Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment3"] = {}
results["Experiment3"]["Architecture"] = "• CNN with 2 layers/max pooling layers\n • 1 full-connected layer\n • no regularization"
model_03.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D)                 (None, 30, 30, 128)       3584
max_pooling2d (MaxPooling2D)    (None, 15, 15, 128)       0
conv2d_1 (Conv2D)               (None, 13, 13, 256)       295168
max_pooling2d_1 (MaxPooling2D)  (None, 6, 6, 256)         0
flatten (Flatten)               (None, 9216)              0
dense (Dense)                   (None, 384)               3539328
dense_1 (Dense)                 (None, 10)                3850
=================================================================
Total params: 3841930 (14.66 MB)
Trainable params: 3841930 (14.66 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
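The parameter counts in the summary can be verified by hand: a Conv2D layer holds (kh × kw × in_channels + 1) × filters weights (the +1 is the bias), and a Dense layer (n_in + 1) × n_out. A small check against the numbers above:

```python
def conv2d_params(kh, kw, in_ch, filters):
    # kernel weights plus one bias per filter
    return (kh * kw * in_ch + 1) * filters

def dense_params(n_in, n_out):
    # weight matrix plus one bias per output unit
    return (n_in + 1) * n_out

print(conv2d_params(3, 3, 3, 128))     # conv2d: 3584
print(conv2d_params(3, 3, 128, 256))   # conv2d_1: 295168
print(dense_params(6 * 6 * 256, 384))  # dense: 3539328
print(dense_params(384, 10))           # dense_1: 3850
```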
keras.utils.plot_model(model_03, "CIFAR10_EXP_03.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
model_03.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
metrics=['accuracy'])
Model Train¶
# Start time
start_time = time.time()
history_03 = model_03.fit(x_train_norm
,y_train_split
,epochs=200
,batch_size=64
,verbose=1
,validation_data=(x_valid_norm, y_valid_split)
,callbacks=[
tf.keras.callbacks.ModelCheckpoint("A2_Exp_03_2CNN_2DNN.h5",save_best_only=True,save_weights_only=False)
,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10),
]
)
# End time
end_time = time.time()
# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment3"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200 704/704 [==============================] - 5s 5ms/step - loss: 1.4488 - accuracy: 0.4804 - val_loss: 1.3456 - val_accuracy: 0.5360
Epoch 2/200 704/704 [==============================] - 3s 4ms/step - loss: 1.0759 - accuracy: 0.6222 - val_loss: 1.0515 - val_accuracy: 0.6150
Epoch 3/200 704/704 [==============================] - 3s 4ms/step - loss: 0.9241 - accuracy: 0.6772 - val_loss: 1.0442 - val_accuracy: 0.6276
Epoch 4/200 704/704 [==============================] - 3s 4ms/step - loss: 0.8107 - accuracy: 0.7172 - val_loss: 0.9475 - val_accuracy: 0.6708
Epoch 5/200 704/704 [==============================] - 3s 4ms/step - loss: 0.7048 - accuracy: 0.7544 - val_loss: 0.9167 - val_accuracy: 0.6832
Epoch 6/200 704/704 [==============================] - 3s 4ms/step - loss: 0.6097 - accuracy: 0.7870 - val_loss: 0.8989 - val_accuracy: 0.6956
Epoch 7/200 704/704 [==============================] - 3s 4ms/step - loss: 0.5161 - accuracy: 0.8210 - val_loss: 0.9226 - val_accuracy: 0.7056
Epoch 8/200 704/704 [==============================] - 3s 4ms/step - loss: 0.4323 - accuracy: 0.8480 - val_loss: 0.9711 - val_accuracy: 0.6940
Epoch 9/200 704/704 [==============================] - 3s 4ms/step - loss: 0.3497 - accuracy: 0.8782 - val_loss: 0.9961 - val_accuracy: 0.7092
Epoch 10/200 704/704 [==============================] - 3s 4ms/step - loss: 0.2733 - accuracy: 0.9049 - val_loss: 1.1377 - val_accuracy: 0.6926
Epoch 11/200 704/704 [==============================] - 3s 4ms/step - loss: 0.2138 - accuracy: 0.9266 - val_loss: 1.2198 - val_accuracy: 0.6988
Epoch 12/200 704/704 [==============================] - 3s 4ms/step - loss: 0.1752 - accuracy: 0.9385 - val_loss: 1.3672 - val_accuracy: 0.7056
Epoch 13/200 704/704 [==============================] - 3s 4ms/step - loss: 0.1419 - accuracy: 0.9511 - val_loss: 1.5439 - val_accuracy: 0.6946
Epoch 14/200 704/704 [==============================] - 3s 4ms/step - loss: 0.1142 - accuracy: 0.9616 - val_loss: 1.6647 - val_accuracy: 0.6838
Epoch 15/200 704/704 [==============================] - 3s 4ms/step - loss: 0.1076 - accuracy: 0.9620 - val_loss: 1.7791 - val_accuracy: 0.6930
Epoch 16/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0886 - accuracy: 0.9692 - val_loss: 1.7916 - val_accuracy: 0.6912
Epoch 17/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0858 - accuracy: 0.9700 - val_loss: 1.8698 - val_accuracy: 0.6914
Epoch 18/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0740 - accuracy: 0.9744 - val_loss: 1.9822 - val_accuracy: 0.6862
Epoch 19/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0853 - accuracy: 0.9701 - val_loss: 2.0255 - val_accuracy: 0.6850
Time taken to train Model: 60.82 seconds
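Training halted after 19 of the 200 requested epochs because `EarlyStopping(monitor='val_accuracy', patience=10)` saw no improvement for 10 consecutive epochs after the epoch-9 peak (0.7092). A pure-Python sketch of that stopping rule (the function name is ours, not a Keras API):

```python
def early_stop_epoch(val_acc_history, patience=10):
    """Return the 0-based epoch index at which early stopping fires, or None."""
    best_acc, best_epoch = float("-inf"), -1
    for epoch, acc in enumerate(val_acc_history):
        if acc > best_acc:
            best_acc, best_epoch = acc, epoch   # new best: reset the wait
        elif epoch - best_epoch >= patience:
            return epoch                        # patience exhausted: stop here
    return None

# val_accuracy per epoch from the log above; best is 0.7092 at index 8
val_acc = [0.5360, 0.6150, 0.6276, 0.6708, 0.6832, 0.6956, 0.7056, 0.6940,
           0.7092, 0.6926, 0.6988, 0.7056, 0.6946, 0.6838, 0.6930, 0.6912,
           0.6914, 0.6862, 0.6850]
print(early_stop_epoch(val_acc))  # 18, i.e. the 19th and final epoch
```

Because the checkpoint callback saves the best model by validation loss, the notebook reloads the saved `.h5` file before evaluating, rather than using the weights from the final epoch.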
train_loss = history_03.history['loss'][-1] # Training loss from the last epoch
train_accuracy = history_03.history['accuracy'][-1] # Training accuracy from the last epoch
val_loss = history_03.history['val_loss'][-1] # Validation loss from the last epoch
val_accuracy = history_03.history['val_accuracy'][-1] # Validation accuracy from the last epoch
# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")
model_03 = tf.keras.models.load_model("A2_Exp_03_2CNN_2DNN.h5")
test_loss, test_accuracy = model_03.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")
results["Experiment3"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment3"]["Test Loss"] = round(test_loss,3)
results["Experiment3"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment3"]["Train Loss"] = round(train_loss,3)
results["Experiment3"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment3"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.085, Training Accuracy: 0.970
Validation Loss: 2.025, Validation Accuracy: 0.685
Test Loss: 0.865, Test Accuracy: 0.713
pred03 = model_03.predict(x_test_norm)
print('shape of preds: ', pred03.shape)
history_03_dict = history_03.history
history_03_df=pd.DataFrame(history_03_dict)
history_03_df.tail().round(3)
313/313 [==============================] - 0s 952us/step
shape of preds:  (10000, 10)
| | loss | accuracy | val_loss | val_accuracy |
|---|---|---|---|---|
| 14 | 0.108 | 0.962 | 1.779 | 0.693 |
| 15 | 0.089 | 0.969 | 1.792 | 0.691 |
| 16 | 0.086 | 0.970 | 1.870 | 0.691 |
| 17 | 0.074 | 0.974 | 1.982 | 0.686 |
| 18 | 0.085 | 0.970 | 2.025 | 0.685 |
Plotting Performance Metrics¶
We use Matplotlib to create two plots, one above the other, showing the training and validation accuracy and loss for each training epoch.
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_03.history['accuracy'], history_03.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_03.history['loss'], history_03.history['val_loss'], 'loss', 212)
Confusion matrices¶
Using sklearn.metrics, we compute the confusion matrix; then we visualize it and see what it tells us.
pred03_cm=np.argmax(pred03, axis=1)
print_validation_report(y_test, pred03_cm)
Classification Report

| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0 | 0.74 | 0.79 | 0.77 | 1000 |
| 1 | 0.77 | 0.87 | 0.82 | 1000 |
| 2 | 0.64 | 0.56 | 0.60 | 1000 |
| 3 | 0.56 | 0.54 | 0.55 | 1000 |
| 4 | 0.65 | 0.67 | 0.66 | 1000 |
| 5 | 0.63 | 0.58 | 0.61 | 1000 |
| 6 | 0.66 | 0.88 | 0.75 | 1000 |
| 7 | 0.81 | 0.70 | 0.75 | 1000 |
| 8 | 0.85 | 0.80 | 0.82 | 1000 |
| 9 | 0.83 | 0.75 | 0.79 | 1000 |
| accuracy | | | 0.71 | 10000 |
| macro avg | 0.71 | 0.71 | 0.71 | 10000 |
| weighted avg | 0.71 | 0.71 | 0.71 | 10000 |
Accuracy Score: 0.7128
Root Mean Square Error: 2.2347930552961723
plot_confusion_matrix(y_test,pred03_cm)
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred03[0:20], columns = ['airplane'
,'automobile'
,'bird'
,'cat'
,'deer'
,'dog'
,'frog'
,'horse'
,'ship'
,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
| | airplane | automobile | bird | cat | deer | dog | frog | horse | ship | truck |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.16% | 0.15% | 0.55% | 83.54% | 0.11% | 11.89% | 3.19% | 0.01% | 0.38% | 0.01% |
| 1 | 0.16% | 11.03% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 88.78% | 0.03% |
| 2 | 2.38% | 10.27% | 0.04% | 0.33% | 0.45% | 0.08% | 0.03% | 0.21% | 83.92% | 2.28% |
| 3 | 55.12% | 3.38% | 6.45% | 0.80% | 9.03% | 0.01% | 0.12% | 0.19% | 24.88% | 0.02% |
| 4 | 0.00% | 0.00% | 0.37% | 2.77% | 59.21% | 0.20% | 37.45% | 0.00% | 0.00% | 0.00% |
| 5 | 0.02% | 0.01% | 2.15% | 1.03% | 1.19% | 4.60% | 90.99% | 0.01% | 0.00% | 0.01% |
| 6 | 0.11% | 91.51% | 0.00% | 0.12% | 0.00% | 2.78% | 0.04% | 0.00% | 0.00% | 5.43% |
| 7 | 3.91% | 0.23% | 44.25% | 2.92% | 4.65% | 1.07% | 42.18% | 0.24% | 0.23% | 0.33% |
| 8 | 0.23% | 0.04% | 22.25% | 54.10% | 4.97% | 11.04% | 4.12% | 3.21% | 0.01% | 0.03% |
| 9 | 0.60% | 98.57% | 0.01% | 0.00% | 0.01% | 0.00% | 0.02% | 0.00% | 0.04% | 0.75% |
| 10 | 78.58% | 0.15% | 0.84% | 1.81% | 17.57% | 0.45% | 0.08% | 0.18% | 0.21% | 0.14% |
| 11 | 0.00% | 0.16% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 99.84% |
| 12 | 0.11% | 1.77% | 13.25% | 10.30% | 3.56% | 56.92% | 9.99% | 0.22% | 3.85% | 0.02% |
| 13 | 0.02% | 0.00% | 0.00% | 0.00% | 0.02% | 0.02% | 0.00% | 99.94% | 0.00% | 0.00% |
| 14 | 0.01% | 6.16% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.14% | 93.69% |
| 15 | 0.82% | 0.47% | 0.05% | 0.46% | 0.09% | 0.14% | 55.77% | 0.00% | 42.21% | 0.00% |
| 16 | 0.00% | 0.08% | 0.06% | 6.27% | 0.00% | 92.59% | 0.02% | 0.96% | 0.00% | 0.00% |
| 17 | 0.77% | 0.05% | 1.49% | 24.14% | 1.27% | 28.00% | 1.74% | 40.95% | 1.15% | 0.44% |
| 18 | 0.08% | 0.95% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 98.71% | 0.25% |
| 19 | 0.00% | 0.01% | 0.08% | 0.17% | 0.03% | 0.03% | 99.67% | 0.00% | 0.00% | 0.00% |
layer_names = []
for layer in model_03.layers:
layer_names.append(layer.name)
print(layer_names)
# Extracts the outputs of all 7 layers:
layer_outputs_03 = [layer.output for layer in model_03.layers[:7]]
# Creates a model that will return these outputs, given the model input:
activation_model_03 = tf.keras.models.Model(inputs=model_03.input, outputs=layer_outputs_03)
# Get activation values for the last dense layer
# activations_03 = activation_model_03.predict(x_valid_norm[:3250])
activations_03 = activation_model_03.predict(x_valid_norm[:1000])
dense_layer_activations_03 = activations_03[-3]
output_layer_activations_03 = activations_03[-1]
['conv2d', 'max_pooling2d', 'conv2d_1', 'max_pooling2d_1', 'flatten', 'dense', 'dense_1']
32/32 [==============================] - 0s 2ms/step
sklearn.manifold.TSNE¶
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_03 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_03 = tsne_03.fit_transform(dense_layer_activations_03)
# Scaling
tsne_results_03 = (tsne_results_03 - tsne_results_03.min()) / (tsne_results_03.max() - tsne_results_03.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 1000 samples in 0.002s...
[t-SNE] Computed neighbors for 1000 samples in 0.245s...
[t-SNE] Computed conditional probabilities for sample 1000 / 1000
[t-SNE] Mean sigma: 2.503065
[t-SNE] KL divergence after 250 iterations with early exaggeration: 67.902954
[t-SNE] KL divergence after 300 iterations: 1.957590
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_03[:,0],tsne_results_03[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_03[:,0],tsne_results_03[:,1], c=y_valid_split[:1000], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)
image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_03):
dist = np.sum((position - image_positions) ** 2, axis=1)
if np.min(dist) > 0.02: # if far enough from other images
image_positions = np.r_[image_positions, [position]]
imagebox = mpl.offsetbox.AnnotationBbox(
mpl.offsetbox.OffsetImage(x_train[index], cmap="binary"),
position, bboxprops={"lw": 1})
plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
Result2:¶
Take Experiment 3: extract the outputs of 2 filters from the 2 max-pooling layers and visualize them in a grid as images. See whether the lit-up regions correspond to features in the original images.
(_,_), (test_images, test_labels) = tf.keras.datasets.cifar10.load_data()
img = test_images[2004]
img_tensor = image.img_to_array(img)
img_tensor = np.expand_dims(img_tensor, axis=0)
class_names = ['airplane'
,'automobile'
,'bird'
,'cat'
,'deer'
,'dog'
,'frog'
,'horse'
,'ship'
,'truck']
plt.imshow(img, cmap='viridis')
plt.axis('off')
plt.show()
activations_cnn_03 = activation_model_03.predict(img_tensor)
len(activations_cnn_03)
1/1 [==============================] - 0s 67ms/step
7
layer_names = []
for layer in model_03.layers:
layer_names.append(layer.name)
layer_names
['conv2d', 'max_pooling2d', 'conv2d_1', 'max_pooling2d_1', 'flatten', 'dense', 'dense_1']
# These are the names of the layers, so can have them as part of our plot
layer_names = []
for layer in model_03.layers[:3]:
layer_names.append(layer.name)
images_per_row = 16
# Now let's display our feature maps
for layer_name, layer_activation in zip(layer_names, activations_cnn_03):
# This is the number of features in the feature map
n_features = layer_activation.shape[-1]
# The feature map has shape (1, size, size, n_features)
size = layer_activation.shape[1]
# We will tile the activation channels in this matrix
n_cols = n_features // images_per_row
display_grid = np.zeros((size * n_cols, images_per_row * size))
# We'll tile each filter into this big horizontal grid
for col in range(n_cols):
for row in range(images_per_row):
channel_image = layer_activation[0,
:, :,
col * images_per_row + row]
# Post-process the feature to make it visually palatable
channel_image -= channel_image.mean()
channel_image /= (channel_image.std() + 1e-5)  # epsilon avoids divide-by-zero for dead channels
channel_image *= 64
channel_image += 128
channel_image = np.clip(channel_image, 0, 255).astype('uint8')
display_grid[col * size : (col + 1) * size,
row * size : (row + 1) * size] = channel_image
# Display the grid
scale = 1. / size
plt.figure(figsize=(scale * display_grid.shape[1],
scale * display_grid.shape[0]))
plt.title(layer_name)
plt.grid(False)
plt.imshow(display_grid, aspect='auto', cmap='viridis')
plt.show();
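The grid above displays every channel of the first three layers; the task as stated only needs 2 filters from each of the 2 max-pooling layers (indices 1 and 3 in `layer_names`). A small helper for picking exactly those maps out of the activation list (the function name and the synthetic stand-in arrays are ours; in the notebook you would pass `activations_cnn_03`):

```python
import numpy as np

def grab_pool_filters(activations, pool_indices=(1, 3), filter_ids=(0, 1)):
    """Collect (layer_index, filter_id, 2-D feature map) triples for plotting."""
    maps = []
    for li in pool_indices:
        for f in filter_ids:
            # activations[li] has shape (1, size, size, n_channels)
            maps.append((li, f, activations[li][0, :, :, f]))
    return maps

# Synthetic stand-ins with the shapes from model_03's summary
activations = [np.zeros((1, 30, 30, 128)), np.zeros((1, 15, 15, 128)),
               np.zeros((1, 13, 13, 256)), np.zeros((1, 6, 6, 256))]
maps = grab_pool_filters(activations)
print([(li, f, m.shape) for li, f, m in maps])
```

Each returned map can then be passed to `plt.imshow` in a 2x2 grid to compare lit-up regions against the original image.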
Experiment 4¶
- CNN with 3 layers/max pooling layers
- 1 fully-connected layer
- no regularization
Build CNN Model¶
k.clear_session()
model_04 = Sequential([
Conv2D(filters=128, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu,input_shape=x_train_norm.shape[1:]),
MaxPool2D((2, 2),strides=2),
Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
MaxPool2D((2, 2),strides=2),
Conv2D(filters=512, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
MaxPool2D((2, 2),strides=2),
Flatten(),
Dense(units=384,activation=tf.nn.relu),
Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment4"] = {}
results["Experiment4"]["Architecture"] = "• CNN with 3 layers/max pooling layers\n • 1 full-connected layer\n • no regularization"
model_04.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D)                 (None, 30, 30, 128)       3584
max_pooling2d (MaxPooling2D)    (None, 15, 15, 128)       0
conv2d_1 (Conv2D)               (None, 13, 13, 256)       295168
max_pooling2d_1 (MaxPooling2D)  (None, 6, 6, 256)         0
conv2d_2 (Conv2D)               (None, 4, 4, 512)         1180160
max_pooling2d_2 (MaxPooling2D)  (None, 2, 2, 512)         0
flatten (Flatten)               (None, 2048)              0
dense (Dense)                   (None, 384)               786816
dense_1 (Dense)                 (None, 10)                3850
=================================================================
Total params: 2269578 (8.66 MB)
Trainable params: 2269578 (8.66 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
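Adding a third conv/pool stage shrinks the flattened vector from 9216 to 2048, which is why model_04 has fewer total parameters than model_03 despite being deeper. The spatial sizes can be traced by hand for 'valid' 3x3 convolutions and 2x2 stride-2 pooling:

```python
def conv_out(s, k=3):
    return s - k + 1   # 'valid' convolution, stride 1

def pool_out(s):
    return s // 2      # 2x2 max pooling, stride 2

s = 32                 # CIFAR-10 input is 32x32
for _ in range(3):
    s = pool_out(conv_out(s))   # 32 -> 30 -> 15 -> 13 -> 6 -> 4 -> 2
print(s, 2 * 2 * 512)  # final 2x2 spatial grid -> flatten = 2048
```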
keras.utils.plot_model(model_04, "CIFAR10_EXP_04.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
model_04.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
metrics=['accuracy'])
Model Train¶
# Start time
start_time = time.time()
history_04 = model_04.fit(x_train_norm
,y_train_split
,epochs=200
,batch_size=64
,verbose=1
,validation_data=(x_valid_norm, y_valid_split)
,callbacks=[
tf.keras.callbacks.ModelCheckpoint("A2_Exp_04_3CNN.h5",save_best_only=True,save_weights_only=False)
,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10),
]
)
# End time
end_time = time.time()
# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment4"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200 704/704 [==============================] - 5s 5ms/step - loss: 1.4767 - accuracy: 0.4625 - val_loss: 1.2243 - val_accuracy: 0.5536
Epoch 2/200 704/704 [==============================] - 3s 4ms/step - loss: 1.0333 - accuracy: 0.6361 - val_loss: 0.9642 - val_accuracy: 0.6586
Epoch 3/200 704/704 [==============================] - 3s 4ms/step - loss: 0.8341 - accuracy: 0.7078 - val_loss: 0.9427 - val_accuracy: 0.6744
Epoch 4/200 704/704 [==============================] - 3s 4ms/step - loss: 0.6947 - accuracy: 0.7584 - val_loss: 0.8242 - val_accuracy: 0.7150
Epoch 5/200 704/704 [==============================] - 3s 4ms/step - loss: 0.5756 - accuracy: 0.7968 - val_loss: 0.8089 - val_accuracy: 0.7252
Epoch 6/200 704/704 [==============================] - 3s 4ms/step - loss: 0.4726 - accuracy: 0.8346 - val_loss: 0.8448 - val_accuracy: 0.7164
Epoch 7/200 704/704 [==============================] - 3s 4ms/step - loss: 0.3756 - accuracy: 0.8683 - val_loss: 0.8963 - val_accuracy: 0.7264
Epoch 8/200 704/704 [==============================] - 3s 4ms/step - loss: 0.2996 - accuracy: 0.8940 - val_loss: 0.9447 - val_accuracy: 0.7218
Epoch 9/200 704/704 [==============================] - 3s 4ms/step - loss: 0.2338 - accuracy: 0.9171 - val_loss: 0.9933 - val_accuracy: 0.7356
Epoch 10/200 704/704 [==============================] - 3s 4ms/step - loss: 0.1809 - accuracy: 0.9363 - val_loss: 1.1054 - val_accuracy: 0.7296
Epoch 11/200 704/704 [==============================] - 3s 4ms/step - loss: 0.1465 - accuracy: 0.9491 - val_loss: 1.2160 - val_accuracy: 0.7322
Epoch 12/200 704/704 [==============================] - 3s 4ms/step - loss: 0.1395 - accuracy: 0.9502 - val_loss: 1.2887 - val_accuracy: 0.7386
Epoch 13/200 704/704 [==============================] - 3s 4ms/step - loss: 0.1124 - accuracy: 0.9593 - val_loss: 1.3953 - val_accuracy: 0.7294
Epoch 14/200 704/704 [==============================] - 3s 4ms/step - loss: 0.1113 - accuracy: 0.9627 - val_loss: 1.3615 - val_accuracy: 0.7360
Epoch 15/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0964 - accuracy: 0.9661 - val_loss: 1.4959 - val_accuracy: 0.7182
Epoch 16/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0878 - accuracy: 0.9696 - val_loss: 1.6343 - val_accuracy: 0.7300
Epoch 17/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0906 - accuracy: 0.9686 - val_loss: 1.6486 - val_accuracy: 0.7308
Epoch 18/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0759 - accuracy: 0.9735 - val_loss: 1.7013 - val_accuracy: 0.7388
Epoch 19/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0865 - accuracy: 0.9709 - val_loss: 1.8354 - val_accuracy: 0.7174
Epoch 20/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0712 - accuracy: 0.9756 - val_loss: 1.7544 - val_accuracy: 0.7198
Epoch 21/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0773 - accuracy: 0.9738 - val_loss: 1.7912 - val_accuracy: 0.7376
Epoch 22/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0713 - accuracy: 0.9757 - val_loss: 1.9533 - val_accuracy: 0.7162
Epoch 23/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0631 - accuracy: 0.9785 - val_loss: 2.0287 - val_accuracy: 0.7156
Epoch 24/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0663 - accuracy: 0.9779 - val_loss: 1.8894 - val_accuracy: 0.7240
Epoch 25/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0607 - accuracy: 0.9790 - val_loss: 1.9877 - val_accuracy: 0.7364
Epoch 26/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0596 - accuracy: 0.9802 - val_loss: 2.0521 - val_accuracy: 0.7298
Epoch 27/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0743 - accuracy: 0.9764 - val_loss: 2.1023 - val_accuracy: 0.7310
Epoch 28/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0585 - accuracy: 0.9808 - val_loss: 1.9562 - val_accuracy: 0.7424
Epoch 29/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0570 - accuracy: 0.9818 - val_loss: 2.0703 - val_accuracy: 0.7324
Epoch 30/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0555 - accuracy: 0.9825 - val_loss: 2.1394 - val_accuracy: 0.7318
Epoch 31/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0687 - accuracy: 0.9777 - val_loss: 2.1082 - val_accuracy: 0.7262
Epoch 32/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0583 - accuracy: 0.9816 - val_loss: 2.0926 - val_accuracy: 0.7282
Epoch 33/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0581 - accuracy: 0.9808 - val_loss: 2.0278 - val_accuracy: 0.7218
Epoch 34/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0513 - accuracy: 0.9838 - val_loss: 2.1546 - val_accuracy: 0.7278
Epoch 35/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0498 - accuracy: 0.9839 - val_loss: 2.2408 - val_accuracy: 0.7178
Epoch 36/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0507 - accuracy: 0.9845 - val_loss: 2.3039 - val_accuracy: 0.7204
Epoch 37/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0573 - accuracy: 0.9814 - val_loss: 2.1244 - val_accuracy: 0.7278
Epoch 38/200 704/704 [==============================] - 3s 4ms/step - loss: 0.0477 - accuracy: 0.9847 - val_loss: 2.3224 - val_accuracy: 0.7228
Time taken to train Model: 119.00 seconds
train_loss = history_04.history['loss'][-1] # Training loss from the last epoch
train_accuracy = history_04.history['accuracy'][-1] # Training accuracy from the last epoch
val_loss = history_04.history['val_loss'][-1] # Validation loss from the last epoch
val_accuracy = history_04.history['val_accuracy'][-1] # Validation accuracy from the last epoch
# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")
model_04 = tf.keras.models.load_model("A2_Exp_04_3CNN.h5")
test_loss, test_accuracy = model_04.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")
results["Experiment4"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment4"]["Test Loss"] = round(test_loss,3)
results["Experiment4"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment4"]["Train Loss"] = round(train_loss,3)
results["Experiment4"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment4"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.048, Training Accuracy: 0.985
Validation Loss: 2.322, Validation Accuracy: 0.723
Test Loss: 0.794, Test Accuracy: 0.736
pred04 = model_04.predict(x_test_norm)
print('shape of preds: ', pred04.shape)
history_04_dict = history_04.history
history_04_df=pd.DataFrame(history_04_dict)
history_04_df.tail().round(3)
313/313 [==============================] - 0s 1ms/step
shape of preds:  (10000, 10)
| | loss | accuracy | val_loss | val_accuracy |
|---|---|---|---|---|
| 33 | 0.051 | 0.984 | 2.155 | 0.728 |
| 34 | 0.050 | 0.984 | 2.241 | 0.718 |
| 35 | 0.051 | 0.984 | 2.304 | 0.720 |
| 36 | 0.057 | 0.981 | 2.124 | 0.728 |
| 37 | 0.048 | 0.985 | 2.322 | 0.723 |
Plotting Performance Metrics¶
We use Matplotlib to create two plots, one above the other, showing the training and validation accuracy and loss for each training epoch.
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_04.history['accuracy'], history_04.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_04.history['loss'], history_04.history['val_loss'], 'loss', 212)
Confusion matrices¶
Using sklearn.metrics, we compute the confusion matrix; then we visualize it and see what it tells us.
pred04_cm=np.argmax(pred04, axis=1)
print_validation_report(y_test, pred04_cm)
Classification Report

| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0 | 0.75 | 0.77 | 0.76 | 1000 |
| 1 | 0.91 | 0.82 | 0.86 | 1000 |
| 2 | 0.64 | 0.65 | 0.65 | 1000 |
| 3 | 0.67 | 0.44 | 0.53 | 1000 |
| 4 | 0.64 | 0.76 | 0.69 | 1000 |
| 5 | 0.66 | 0.66 | 0.66 | 1000 |
| 6 | 0.79 | 0.79 | 0.79 | 1000 |
| 7 | 0.80 | 0.75 | 0.77 | 1000 |
| 8 | 0.72 | 0.91 | 0.80 | 1000 |
| 9 | 0.80 | 0.82 | 0.81 | 1000 |
| accuracy | | | 0.74 | 10000 |
| macro avg | 0.74 | 0.74 | 0.73 | 10000 |
| weighted avg | 0.74 | 0.74 | 0.73 | 10000 |
Accuracy Score: 0.736
Root Mean Square Error: 2.1444113411377024
plot_confusion_matrix(y_test,pred04_cm)
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred04[0:20], columns = ['airplane'
,'automobile'
,'bird'
,'cat'
,'deer'
,'dog'
,'frog'
,'horse'
,'ship'
,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
| | airplane | automobile | bird | cat | deer | dog | frog | horse | ship | truck |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.03% | 0.01% | 0.05% | 91.71% | 0.27% | 3.54% | 2.66% | 0.02% | 1.70% | 0.01% |
| 1 | 2.28% | 0.14% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 97.58% | 0.01% |
| 2 | 11.29% | 3.51% | 0.26% | 0.50% | 0.05% | 0.07% | 0.05% | 0.23% | 80.51% | 3.55% |
| 3 | 93.39% | 0.63% | 0.33% | 0.04% | 0.15% | 0.00% | 0.04% | 0.01% | 5.32% | 0.09% |
| 4 | 0.00% | 0.00% | 0.41% | 0.15% | 95.40% | 0.01% | 4.03% | 0.00% | 0.00% | 0.00% |
| 5 | 0.01% | 0.02% | 0.65% | 0.51% | 0.67% | 2.41% | 95.29% | 0.12% | 0.31% | 0.02% |
| 6 | 0.20% | 3.41% | 0.13% | 0.98% | 0.00% | 0.36% | 0.12% | 0.04% | 0.22% | 94.54% |
| 7 | 0.71% | 0.05% | 12.72% | 5.50% | 7.49% | 3.36% | 69.84% | 0.06% | 0.11% | 0.16% |
| 8 | 0.20% | 0.06% | 2.11% | 70.86% | 8.08% | 11.48% | 3.65% | 3.32% | 0.09% | 0.15% |
| 9 | 0.62% | 69.43% | 0.18% | 0.01% | 0.00% | 0.01% | 0.63% | 0.00% | 0.98% | 28.12% |
| 10 | 13.42% | 0.01% | 10.04% | 5.63% | 55.98% | 5.28% | 0.12% | 4.54% | 4.79% | 0.21% |
| 11 | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 100.00% |
| 12 | 0.02% | 0.12% | 14.71% | 7.28% | 28.53% | 43.26% | 3.36% | 2.56% | 0.14% | 0.01% |
| 13 | 0.00% | 0.00% | 0.00% | 0.01% | 0.15% | 0.34% | 0.00% | 99.50% | 0.00% | 0.00% |
| 14 | 0.00% | 0.02% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 99.98% |
| 15 | 5.27% | 1.11% | 0.61% | 1.61% | 1.57% | 0.25% | 23.30% | 0.00% | 66.18% | 0.10% |
| 16 | 0.01% | 0.07% | 0.34% | 37.65% | 0.04% | 56.56% | 0.09% | 4.82% | 0.10% | 0.32% |
| 17 | 0.78% | 0.02% | 8.15% | 12.73% | 23.52% | 13.49% | 1.40% | 38.95% | 0.54% | 0.43% |
| 18 | 0.01% | 0.02% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 99.95% | 0.01% |
| 19 | 0.00% | 0.60% | 0.15% | 0.40% | 0.35% | 0.17% | 97.85% | 0.45% | 0.00% | 0.02% |
layer_names = []
for layer in model_04.layers:
layer_names.append(layer.name)
layer_names
['conv2d', 'max_pooling2d', 'conv2d_1', 'max_pooling2d_1', 'conv2d_2', 'max_pooling2d_2', 'flatten', 'dense', 'dense_1']
# Extracts the outputs of all 9 layers:
layer_outputs_04 = [layer.output for layer in model_04.layers[:9]]
# Creates a model that will return these outputs, given the model input:
activation_model_04 = tf.keras.models.Model(inputs=model_04.input, outputs=layer_outputs_04)
# Get activation values for the last dense layer
# activations_04 = activation_model_04.predict(x_valid_norm[:3250])
activations_04 = activation_model_04.predict(x_valid_norm[:1000])
dense_layer_activations_04 = activations_04[-3]
output_layer_activations_04 = activations_04[-1]
32/32 [==============================] - 0s 1ms/step
sklearn.manifold.TSNE¶
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_04 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_04 = tsne_04.fit_transform(dense_layer_activations_04)
# Scaling
tsne_results_04 = (tsne_results_04 - tsne_results_04.min()) / (tsne_results_04.max() - tsne_results_04.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 1000 samples in 0.001s...
[t-SNE] Computed neighbors for 1000 samples in 0.057s...
[t-SNE] Computed conditional probabilities for sample 1000 / 1000
[t-SNE] Mean sigma: 1.816506
[t-SNE] KL divergence after 250 iterations with early exaggeration: 64.036034
[t-SNE] KL divergence after 300 iterations: 1.867316
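The scaling step above maps the t-SNE coordinates into [0, 1] using the global minimum and maximum; this is what makes the fixed 0.02 squared-distance threshold in the image-overlay loop below meaningful regardless of t-SNE's raw output scale. A minimal sketch of that transformation on toy data:

```python
import numpy as np

def minmax_scale(a):
    """Map an array into [0, 1] using its global min and max, as in the cell above."""
    return (a - a.min()) / (a.max() - a.min())

pts = np.array([[-3.0, 1.0], [0.0, 5.0]])  # stand-in for raw t-SNE coordinates
scaled = minmax_scale(pts)
print(scaled.min(), scaled.max())  # 0.0 1.0
```

Note that this is a single global scaling, not per-axis, so the relative geometry of the embedding is preserved.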
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_04[:,0],tsne_results_04[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_04[:,0],tsne_results_04[:,1], c=y_valid_split[:1000], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)
image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_04):
    dist = np.sum((position - image_positions) ** 2, axis=1)
    if np.min(dist) > 0.02:  # if far enough from other images
        image_positions = np.r_[image_positions, [position]]
        imagebox = mpl.offsetbox.AnnotationBbox(
            # use the validation images the activations came from
            # (x_train[index] indexed the wrong dataset)
            mpl.offsetbox.OffsetImage(x_valid_norm[index], cmap="binary"),
            position, bboxprops={"lw": 1})
        plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
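Beyond eyeballing the scatterplot, the class separation in the 2-D embedding can be quantified. One option (an addition here, not part of the original notebook) is the silhouette score over the embedded points and their true labels; shown on synthetic data standing in for `tsne_results_04` and `y_valid_split`:

```python
import numpy as np
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Two tight synthetic 2-D clusters standing in for a scaled t-SNE embedding
emb = np.vstack([rng.normal(0.2, 0.02, size=(50, 2)),
                 rng.normal(0.8, 0.02, size=(50, 2))])
labels = np.array([0] * 50 + [1] * 50)
score = silhouette_score(emb, labels)  # in [-1, 1]; higher = better-separated classes
print(round(float(score), 3))
```

Applied to the real embeddings, this gives a single number to compare how well each experiment's hidden layer separates the ten classes.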
Experiment 5¶
- DNN with 2 layers (384, 768)
- Batch Normalization
- L2 Regularization (0.001), present in the code but commented out in this run (only batch normalization is active)
Build DNN Model¶
k.clear_session()
model_05 = Sequential([
Flatten(input_shape=x_train_norm.shape[1:]),
Dense(units=384,activation=tf.nn.relu),
# Dense(units=384,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(0.001)),
BatchNormalization(),
# Dropout(0.3),
Dense(units=768,activation=tf.nn.relu),
# Dense(units=768,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(0.001)),
BatchNormalization(),
# Dropout(0.3),
Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment5"] = {}
results["Experiment5"]["Architecture"] = "• DNN with 2 layers (384, 768)\n • Batch Normalization\n • L2 Regularization(0.001) (commented out)"
model_05.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten (Flatten)                          (None, 3072)         0
dense (Dense)                              (None, 384)          1180032
batch_normalization (BatchNormalization)   (None, 384)          1536
dense_1 (Dense)                            (None, 768)          295680
batch_normalization_1 (BatchNormalization) (None, 768)          3072
dense_2 (Dense)                            (None, 10)           7690
=================================================================
Total params: 1488010 (5.68 MB)
Trainable params: 1485706 (5.67 MB)
Non-trainable params: 2304 (9.00 KB)
_________________________________________________________________
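The parameter counts in the summary can be verified by hand: a dense layer has `n_in * n_out` weights plus `n_out` biases, and each BatchNormalization layer carries 4 parameters per unit (gamma, beta, moving mean, moving variance), of which only gamma and beta are trainable:

```python
# Hand-check of the parameter counts reported in the summary above
def dense_params(n_in, n_out):
    return n_in * n_out + n_out           # weight matrix + biases

def bn_params(units):
    return 4 * units                      # gamma, beta, moving mean, moving variance

assert dense_params(32 * 32 * 3, 384) == 1180032   # dense
assert bn_params(384) == 1536                      # batch_normalization
assert dense_params(384, 768) == 295680            # dense_1
assert bn_params(768) == 3072                      # batch_normalization_1
assert dense_params(768, 10) == 7690               # dense_2
assert 1180032 + 1536 + 295680 + 3072 + 7690 == 1488010   # total params
# Non-trainable = moving mean/variance of both BN layers, 2 per unit
assert 2 * (384 + 768) == 2304
```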
keras.utils.plot_model(model_05, "CIFAR10_EXP_05.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
model_05.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
metrics=['accuracy'])
Model Train¶
# Start time
start_time = time.time()
history_05 = model_05.fit(x_train_norm
,y_train_split
,epochs=200
,batch_size=64
,verbose=1
,validation_data=(x_valid_norm, y_valid_split)
,callbacks=[
tf.keras.callbacks.ModelCheckpoint("A2_Exp_05_2DNN_BN.h5",save_best_only=True,save_weights_only=False)
,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=7),
]
)
# End time
end_time = time.time()
# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment5"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
704/704 [==============================] - 4s 3ms/step - loss: 1.7934 - accuracy: 0.3757 - val_loss: 1.8845 - val_accuracy: 0.3346
Epoch 2/200
704/704 [==============================] - 2s 3ms/step - loss: 1.6300 - accuracy: 0.4255 - val_loss: 1.9138 - val_accuracy: 0.3520
Epoch 3/200
704/704 [==============================] - 2s 3ms/step - loss: 1.5344 - accuracy: 0.4576 - val_loss: 1.7508 - val_accuracy: 0.3996
Epoch 4/200
704/704 [==============================] - 2s 3ms/step - loss: 1.4796 - accuracy: 0.4782 - val_loss: 1.8204 - val_accuracy: 0.4072
Epoch 5/200
704/704 [==============================] - 2s 3ms/step - loss: 1.4278 - accuracy: 0.4950 - val_loss: 1.7341 - val_accuracy: 0.4230
Epoch 6/200
704/704 [==============================] - 2s 3ms/step - loss: 1.3750 - accuracy: 0.5170 - val_loss: 1.7756 - val_accuracy: 0.4160
Epoch 7/200
704/704 [==============================] - 2s 3ms/step - loss: 1.3320 - accuracy: 0.5323 - val_loss: 1.5763 - val_accuracy: 0.4724
Epoch 8/200
704/704 [==============================] - 2s 3ms/step - loss: 1.2988 - accuracy: 0.5404 - val_loss: 1.7172 - val_accuracy: 0.4260
Epoch 9/200
704/704 [==============================] - 2s 3ms/step - loss: 1.2566 - accuracy: 0.5573 - val_loss: 1.5500 - val_accuracy: 0.4836
Epoch 10/200
704/704 [==============================] - 2s 3ms/step - loss: 1.2227 - accuracy: 0.5688 - val_loss: 1.6693 - val_accuracy: 0.4416
Epoch 11/200
704/704 [==============================] - 2s 3ms/step - loss: 1.1923 - accuracy: 0.5769 - val_loss: 1.6718 - val_accuracy: 0.4568
Epoch 12/200
704/704 [==============================] - 2s 3ms/step - loss: 1.1680 - accuracy: 0.5863 - val_loss: 1.6790 - val_accuracy: 0.4488
Epoch 13/200
704/704 [==============================] - 2s 3ms/step - loss: 1.1472 - accuracy: 0.5940 - val_loss: 1.5340 - val_accuracy: 0.4858
Epoch 14/200
704/704 [==============================] - 2s 3ms/step - loss: 1.1163 - accuracy: 0.6065 - val_loss: 1.8078 - val_accuracy: 0.4478
Epoch 15/200
704/704 [==============================] - 2s 3ms/step - loss: 1.0951 - accuracy: 0.6126 - val_loss: 1.5854 - val_accuracy: 0.4812
Epoch 16/200
704/704 [==============================] - 2s 3ms/step - loss: 1.0763 - accuracy: 0.6208 - val_loss: 1.6121 - val_accuracy: 0.4808
Epoch 17/200
704/704 [==============================] - 2s 3ms/step - loss: 1.0528 - accuracy: 0.6260 - val_loss: 1.6193 - val_accuracy: 0.4784
Epoch 18/200
704/704 [==============================] - 2s 3ms/step - loss: 1.0291 - accuracy: 0.6348 - val_loss: 1.7509 - val_accuracy: 0.4656
Epoch 19/200
704/704 [==============================] - 2s 3ms/step - loss: 1.0133 - accuracy: 0.6390 - val_loss: 1.6757 - val_accuracy: 0.4548
Epoch 20/200
704/704 [==============================] - 2s 3ms/step - loss: 0.9909 - accuracy: 0.6495 - val_loss: 1.6374 - val_accuracy: 0.4772
Time taken to train Model: 41.89 seconds
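The step counters in the logs follow from sample counts and batch sizes: steps = ceil(samples / batch_size). Assuming a 45,000/5,000 train/validation split of CIFAR-10's 50,000 training images (the split code is outside this excerpt), the "704/704" during fit and the "313/313", "32/32", and "47/47" during predict (which defaults to batch_size=32) all check out:

```python
import math

# Steps per epoch / per prediction pass are ceil(samples / batch_size)
assert math.ceil(45000 / 64) == 704    # fit: assumed 45k-image training split, batch_size=64
assert math.ceil(10000 / 32) == 313    # model.predict on the 10k test images (default batch_size=32)
assert math.ceil(1000 / 32) == 32      # activation-model predict on 1,000 validation images
assert math.ceil(1500 / 32) == 47      # activation-model predict on 1,500 validation images
```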
train_loss = history_05.history['loss'][-1] # Training loss from the last epoch
train_accuracy = history_05.history['accuracy'][-1] # Training accuracy from the last epoch
val_loss = history_05.history['val_loss'][-1] # Validation loss from the last epoch
val_accuracy = history_05.history['val_accuracy'][-1] # Validation accuracy from the last epoch
# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")
model_05 = tf.keras.models.load_model("A2_Exp_05_2DNN_BN.h5")
test_loss, test_accuracy = model_05.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")
results["Experiment5"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment5"]["Test Loss"] = round(test_loss,3)
results["Experiment5"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment5"]["Train Loss"] = round(train_loss,3)
results["Experiment5"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment5"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.991, Training Accuracy: 0.650
Validation Loss: 1.637, Validation Accuracy: 0.477
Test Loss: 1.482, Test Accuracy: 0.490
pred05 = model_05.predict(x_test_norm)
print('shape of preds: ', pred05.shape)
313/313 [==============================] - 0s 938us/step
shape of preds:  (10000, 10)
history_05_dict = history_05.history
history_05_dict.keys()
history_05_df=pd.DataFrame(history_05_dict)
history_05_df.tail().round(3)
| | loss | accuracy | val_loss | val_accuracy |
|---|---|---|---|---|
| 15 | 1.076 | 0.621 | 1.612 | 0.481 |
| 16 | 1.053 | 0.626 | 1.619 | 0.478 |
| 17 | 1.029 | 0.635 | 1.751 | 0.466 |
| 18 | 1.013 | 0.639 | 1.676 | 0.455 |
| 19 | 0.991 | 0.650 | 1.637 | 0.477 |
Plotting Performance Metrics¶
We use Matplotlib to create two plots, displaying the training and validation accuracy and loss for each training epoch in two stacked subplots.
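`display_training_curves` is a user-defined helper defined earlier in the notebook, outside this excerpt. A minimal sketch of what such a helper might look like, with all details assumed from the call sites below (metric lists, a title, and a Matplotlib subplot code like 211):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs in scripts
import matplotlib.pyplot as plt

def display_training_curves(training, validation, title, subplot):
    """Plot one metric's train/validation curves on the given subplot (e.g. 211)."""
    ax = plt.subplot(subplot)
    ax.plot(training, label="train")
    ax.plot(validation, label="validation")
    ax.set_title("model " + title)
    ax.set_ylabel(title)
    ax.set_xlabel("epoch")
    ax.legend()
    return ax

ax = display_training_curves([0.4, 0.5, 0.6], [0.35, 0.42, 0.45], "accuracy", 211)
```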
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_05.history['accuracy'], history_05.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_05.history['loss'], history_05.history['val_loss'], 'loss', 212)
Confusion matrices¶
Using sklearn.metrics, we compute a classification report and the confusion matrix, then visualize the confusion matrix to see what it tells us.
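`print_validation_report` and `plot_confusion_matrix` are user-defined helpers defined earlier in the notebook, outside this excerpt. Judging from their printed output below, a minimal sketch of what the report helper might wrap (the exact implementation is an assumption):

```python
import numpy as np
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, mean_squared_error)

def print_validation_report(y_true, y_pred):
    """Print the metrics shown below and return the confusion matrix."""
    print("Classification Report")
    print(classification_report(y_true, y_pred))
    print("Accuracy Score:", accuracy_score(y_true, y_pred))
    # RMSE over the integer class indices, matching the number printed below
    # (of limited meaning for nominal labels, since class order is arbitrary)
    print("Root Mean Square Error:", np.sqrt(mean_squared_error(y_true, y_pred)))
    return confusion_matrix(y_true, y_pred)

cm_demo = print_validation_report(np.array([0, 1, 2, 2]), np.array([0, 1, 2, 1]))
```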
pred05_cm=np.argmax(pred05, axis=1)
print_validation_report(y_test, pred05_cm)
Classification Report
precision recall f1-score support
0 0.67 0.31 0.43 1000
1 0.69 0.46 0.55 1000
2 0.35 0.45 0.40 1000
3 0.42 0.13 0.20 1000
4 0.43 0.44 0.43 1000
5 0.41 0.49 0.44 1000
6 0.52 0.60 0.56 1000
7 0.54 0.60 0.56 1000
8 0.59 0.69 0.64 1000
9 0.44 0.73 0.55 1000
accuracy 0.49 10000
macro avg 0.51 0.49 0.48 10000
weighted avg 0.51 0.49 0.48 10000
Accuracy Score: 0.4898
Root Mean Square Error: 3.205604467179318
plot_confusion_matrix(y_test,pred05_cm)
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred05[0:20], columns = ['airplane'
,'automobile'
,'bird'
,'cat'
,'deer'
,'dog'
,'frog'
,'horse'
,'ship'
,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
| | airplane | automobile | bird | cat | deer | dog | frog | horse | ship | truck |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.38% | 9.75% | 4.75% | 43.39% | 5.04% | 32.23% | 0.92% | 0.00% | 2.09% | 1.46% |
| 1 | 0.10% | 1.47% | 0.00% | 0.00% | 0.01% | 0.00% | 0.00% | 0.04% | 73.42% | 24.95% |
| 2 | 16.85% | 7.01% | 4.47% | 0.04% | 0.32% | 0.02% | 0.05% | 8.82% | 37.07% | 25.34% |
| 3 | 4.95% | 0.49% | 7.25% | 1.18% | 14.89% | 1.15% | 0.02% | 46.92% | 22.48% | 0.66% |
| 4 | 0.02% | 0.03% | 3.28% | 0.61% | 40.27% | 6.05% | 49.31% | 0.09% | 0.31% | 0.02% |
| 5 | 1.02% | 0.76% | 3.89% | 12.78% | 10.28% | 13.96% | 47.91% | 1.69% | 0.06% | 7.63% |
| 6 | 0.32% | 96.28% | 0.04% | 1.93% | 0.00% | 1.21% | 0.06% | 0.04% | 0.03% | 0.09% |
| 7 | 0.85% | 0.95% | 6.63% | 0.41% | 0.87% | 0.12% | 86.49% | 0.01% | 0.02% | 3.65% |
| 8 | 1.01% | 0.03% | 23.63% | 1.62% | 59.05% | 12.16% | 1.01% | 0.96% | 0.44% | 0.07% |
| 9 | 0.12% | 72.50% | 0.07% | 0.08% | 0.24% | 0.08% | 0.02% | 0.06% | 1.13% | 25.71% |
| 10 | 13.08% | 0.16% | 10.60% | 0.36% | 1.27% | 2.20% | 0.57% | 0.03% | 71.68% | 0.04% |
| 11 | 0.01% | 4.14% | 0.00% | 0.12% | 0.04% | 0.02% | 0.04% | 0.11% | 0.14% | 95.37% |
| 12 | 0.12% | 2.57% | 18.14% | 8.48% | 4.32% | 42.06% | 19.82% | 3.36% | 0.49% | 0.63% |
| 13 | 19.86% | 2.73% | 0.35% | 0.06% | 0.15% | 0.47% | 0.23% | 73.28% | 2.19% | 0.68% |
| 14 | 0.02% | 17.32% | 2.60% | 0.92% | 0.01% | 2.00% | 0.24% | 0.23% | 0.02% | 76.65% |
| 15 | 6.00% | 0.04% | 1.93% | 6.35% | 11.71% | 30.81% | 6.95% | 2.44% | 33.72% | 0.04% |
| 16 | 6.53% | 10.46% | 11.06% | 20.27% | 0.06% | 25.95% | 0.22% | 17.38% | 0.08% | 8.00% |
| 17 | 7.17% | 0.38% | 11.43% | 8.87% | 9.74% | 11.63% | 3.44% | 25.84% | 0.94% | 20.58% |
| 18 | 0.08% | 0.30% | 0.01% | 0.03% | 0.16% | 0.01% | 0.00% | 0.10% | 99.01% | 0.30% |
| 19 | 0.33% | 2.80% | 9.26% | 5.52% | 0.37% | 31.39% | 23.20% | 24.09% | 0.06% | 3.00% |
layer_names = [layer.name for layer in model_05.layers]
layer_names
['flatten', 'dense', 'batch_normalization', 'dense_1', 'batch_normalization_1', 'dense_2']
# Extract the outputs of all 6 layers:
layer_outputs = [layer.output for layer in model_05.layers[:6]]
# Create a model that will return these outputs, given the model input:
activation_model_05 = tf.keras.models.Model(inputs=model_05.input, outputs=layer_outputs)
# Get activation values on the first 1,500 validation images
# activations_05 = activation_model_05.predict(x_valid_norm[:3250])
activations_05 = activation_model_05.predict(x_valid_norm[:1500])
dense_layer_activations_05 = activations_05[-3]   # 'dense_1', the 768-unit hidden layer
output_layer_activations_05 = activations_05[-1]  # softmax outputs
47/47 [==============================] - 0s 943us/step
sklearn.manifold.TSNE¶
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_05 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_05 = tsne_05.fit_transform(dense_layer_activations_05)
# Scaling
tsne_results_05 = (tsne_results_05 - tsne_results_05.min()) / (tsne_results_05.max() - tsne_results_05.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 1500 samples in 0.001s...
[t-SNE] Computed neighbors for 1500 samples in 0.033s...
[t-SNE] Computed conditional probabilities for sample 1000 / 1500
[t-SNE] Computed conditional probabilities for sample 1500 / 1500
[t-SNE] Mean sigma: 4.815121
[t-SNE] KL divergence after 250 iterations with early exaggeration: 65.247437
[t-SNE] KL divergence after 300 iterations: 1.694011
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
scatter = plt.scatter(tsne_results_05[:,0],tsne_results_05[:,1], c=y_valid_split[:1500], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)
image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_05):
    dist = np.sum((position - image_positions) ** 2, axis=1)
    if np.min(dist) > 0.02:  # if far enough from other images
        image_positions = np.r_[image_positions, [position]]
        imagebox = mpl.offsetbox.AnnotationBbox(
            # use the validation images the activations came from
            # (x_train[index] indexed the wrong dataset)
            mpl.offsetbox.OffsetImage(x_valid_norm[index], cmap="binary"),
            position, bboxprops={"lw": 1})
        plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
Experiment 6¶
- DNN with 3 layers (384, 768, 1536)
- Regularization: batch normalization
Build DNN Model¶
k.clear_session()
model_06 = Sequential([
Flatten(input_shape=x_train_norm.shape[1:]),
Dense(units=384,activation=tf.nn.relu),
BatchNormalization(),
Dense(units=768,activation=tf.nn.relu),
BatchNormalization(),
Dense(units=1536,activation=tf.nn.relu),
BatchNormalization(),
Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment6"] = {}
results["Experiment6"]["Architecture"] = "• DNN with 3 layers\n • Regularization: batch normalization"
model_06.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten (Flatten)                          (None, 3072)         0
dense (Dense)                              (None, 384)          1180032
batch_normalization (BatchNormalization)   (None, 384)          1536
dense_1 (Dense)                            (None, 768)          295680
batch_normalization_1 (BatchNormalization) (None, 768)          3072
dense_2 (Dense)                            (None, 1536)         1181184
batch_normalization_2 (BatchNormalization) (None, 1536)         6144
dense_3 (Dense)                            (None, 10)           15370
=================================================================
Total params: 2683018 (10.23 MB)
Trainable params: 2677642 (10.21 MB)
Non-trainable params: 5376 (21.00 KB)
_________________________________________________________________
keras.utils.plot_model(model_06, "CIFAR10_EXP_06.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
model_06.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
metrics=['accuracy'])
Model Train¶
# Start time
start_time = time.time()
history_06 = model_06.fit(x_train_norm
,y_train_split
,epochs=200
,batch_size=64
,verbose=1
,validation_data=(x_valid_norm, y_valid_split)
,callbacks=[
tf.keras.callbacks.ModelCheckpoint("A2_Exp_06_3DNN_BN.h5",save_best_only=True,save_weights_only=False)
,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=7),
]
)
# End time
end_time = time.time()
# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment6"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
704/704 [==============================] - 4s 4ms/step - loss: 1.8833 - accuracy: 0.3563 - val_loss: 1.9355 - val_accuracy: 0.3538
Epoch 2/200
704/704 [==============================] - 2s 3ms/step - loss: 1.6770 - accuracy: 0.4168 - val_loss: 2.0260 - val_accuracy: 0.3426
Epoch 3/200
704/704 [==============================] - 2s 3ms/step - loss: 1.5890 - accuracy: 0.4464 - val_loss: 2.0122 - val_accuracy: 0.3434
Epoch 4/200
704/704 [==============================] - 2s 3ms/step - loss: 1.5105 - accuracy: 0.4751 - val_loss: 2.0802 - val_accuracy: 0.3500
Epoch 5/200
704/704 [==============================] - 2s 3ms/step - loss: 1.4396 - accuracy: 0.4984 - val_loss: 1.7454 - val_accuracy: 0.3938
Epoch 6/200
704/704 [==============================] - 2s 3ms/step - loss: 1.3771 - accuracy: 0.5208 - val_loss: 1.5798 - val_accuracy: 0.4652
Epoch 7/200
704/704 [==============================] - 2s 3ms/step - loss: 1.3153 - accuracy: 0.5418 - val_loss: 1.6387 - val_accuracy: 0.4338
Epoch 8/200
704/704 [==============================] - 2s 3ms/step - loss: 1.2693 - accuracy: 0.5574 - val_loss: 1.5263 - val_accuracy: 0.4718
Epoch 9/200
704/704 [==============================] - 2s 3ms/step - loss: 1.2122 - accuracy: 0.5766 - val_loss: 1.6066 - val_accuracy: 0.4688
Epoch 10/200
704/704 [==============================] - 2s 3ms/step - loss: 1.1561 - accuracy: 0.5907 - val_loss: 1.5726 - val_accuracy: 0.4594
Epoch 11/200
704/704 [==============================] - 2s 3ms/step - loss: 1.1064 - accuracy: 0.6077 - val_loss: 1.6209 - val_accuracy: 0.4794
Epoch 12/200
704/704 [==============================] - 2s 3ms/step - loss: 1.0558 - accuracy: 0.6227 - val_loss: 1.6910 - val_accuracy: 0.4632
Epoch 13/200
704/704 [==============================] - 2s 3ms/step - loss: 1.0112 - accuracy: 0.6379 - val_loss: 1.8966 - val_accuracy: 0.4436
Epoch 14/200
704/704 [==============================] - 2s 3ms/step - loss: 0.9557 - accuracy: 0.6568 - val_loss: 1.6506 - val_accuracy: 0.4906
Epoch 15/200
704/704 [==============================] - 2s 3ms/step - loss: 0.9105 - accuracy: 0.6745 - val_loss: 1.8542 - val_accuracy: 0.4586
Epoch 16/200
704/704 [==============================] - 2s 3ms/step - loss: 0.8585 - accuracy: 0.6913 - val_loss: 1.8315 - val_accuracy: 0.4930
Epoch 17/200
704/704 [==============================] - 2s 3ms/step - loss: 0.8103 - accuracy: 0.7072 - val_loss: 1.8081 - val_accuracy: 0.4740
Epoch 18/200
704/704 [==============================] - 2s 3ms/step - loss: 0.7768 - accuracy: 0.7217 - val_loss: 1.9738 - val_accuracy: 0.4674
Epoch 19/200
704/704 [==============================] - 2s 3ms/step - loss: 0.7420 - accuracy: 0.7332 - val_loss: 37.4505 - val_accuracy: 0.4478
Epoch 20/200
704/704 [==============================] - 2s 3ms/step - loss: 0.6955 - accuracy: 0.7502 - val_loss: 4.8527 - val_accuracy: 0.4738
Epoch 21/200
704/704 [==============================] - 2s 3ms/step - loss: 0.6527 - accuracy: 0.7657 - val_loss: 2.3160 - val_accuracy: 0.4616
Epoch 22/200
704/704 [==============================] - 2s 3ms/step - loss: 0.6247 - accuracy: 0.7728 - val_loss: 2.4464 - val_accuracy: 0.4706
Epoch 23/200
704/704 [==============================] - 2s 3ms/step - loss: 0.5808 - accuracy: 0.7917 - val_loss: 2.4125 - val_accuracy: 0.4782
Time taken to train Model: 56.52 seconds
train_loss = history_06.history['loss'][-1] # Training loss from the last epoch
train_accuracy = history_06.history['accuracy'][-1] # Training accuracy from the last epoch
val_loss = history_06.history['val_loss'][-1] # Validation loss from the last epoch
val_accuracy = history_06.history['val_accuracy'][-1] # Validation accuracy from the last epoch
# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")
model_06 = tf.keras.models.load_model("A2_Exp_06_3DNN_BN.h5")
test_loss, test_accuracy = model_06.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")
results["Experiment6"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment6"]["Test Loss"] = round(test_loss,3)
results["Experiment6"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment6"]["Train Loss"] = round(train_loss,3)
results["Experiment6"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment6"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.581, Training Accuracy: 0.792
Validation Loss: 2.413, Validation Accuracy: 0.478
Test Loss: 1.480, Test Accuracy: 0.483
pred06 = model_06.predict(x_test_norm)
print('shape of preds: ', pred06.shape)
313/313 [==============================] - 0s 960us/step
shape of preds:  (10000, 10)
history_06_dict = history_06.history
history_06_df=pd.DataFrame(history_06_dict)
history_06_df.tail().round(3)
| | loss | accuracy | val_loss | val_accuracy |
|---|---|---|---|---|
| 18 | 0.742 | 0.733 | 37.450 | 0.448 |
| 19 | 0.695 | 0.750 | 4.853 | 0.474 |
| 20 | 0.653 | 0.766 | 2.316 | 0.462 |
| 21 | 0.625 | 0.773 | 2.446 | 0.471 |
| 22 | 0.581 | 0.792 | 2.413 | 0.478 |
Plotting Performance Metrics¶
We use Matplotlib to create two plots, displaying the training and validation accuracy and loss for each training epoch in two stacked subplots.
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_06.history['accuracy'], history_06.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_06.history['loss'], history_06.history['val_loss'], 'loss', 212)
Confusion matrices¶
Using sklearn.metrics, we compute a classification report and the confusion matrix, then visualize the confusion matrix to see what it tells us.
pred06_cm=np.argmax(pred06, axis=1)
print_validation_report(y_test, pred06_cm)
Classification Report
precision recall f1-score support
0 0.57 0.45 0.50 1000
1 0.58 0.63 0.61 1000
2 0.32 0.51 0.40 1000
3 0.33 0.36 0.35 1000
4 0.51 0.25 0.33 1000
5 0.42 0.37 0.40 1000
6 0.61 0.43 0.50 1000
7 0.51 0.62 0.56 1000
8 0.66 0.57 0.62 1000
9 0.49 0.63 0.55 1000
accuracy 0.48 10000
macro avg 0.50 0.48 0.48 10000
weighted avg 0.50 0.48 0.48 10000
Accuracy Score: 0.4834
Root Mean Square Error: 3.1218103722039237
plot_confusion_matrix(y_test,pred06_cm)
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred06[0:20], columns = ['airplane'
,'automobile'
,'bird'
,'cat'
,'deer'
,'dog'
,'frog'
,'horse'
,'ship'
,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
| | airplane | automobile | bird | cat | deer | dog | frog | horse | ship | truck |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7.66% | 2.98% | 5.13% | 50.61% | 11.89% | 15.55% | 0.75% | 2.57% | 1.59% | 1.27% |
| 1 | 4.73% | 25.66% | 1.40% | 0.43% | 0.18% | 0.07% | 0.08% | 0.87% | 32.65% | 33.93% |
| 2 | 26.09% | 24.59% | 1.21% | 0.39% | 0.34% | 0.39% | 0.05% | 0.27% | 34.50% | 12.17% |
| 3 | 10.73% | 2.26% | 26.29% | 3.68% | 7.94% | 1.76% | 0.10% | 20.67% | 21.73% | 4.82% |
| 4 | 0.36% | 0.01% | 14.09% | 1.07% | 51.07% | 12.86% | 19.84% | 0.63% | 0.05% | 0.01% |
| 5 | 1.45% | 0.14% | 5.52% | 16.95% | 2.42% | 6.65% | 61.91% | 4.68% | 0.10% | 0.18% |
| 6 | 1.71% | 33.93% | 2.43% | 45.51% | 0.02% | 3.89% | 0.41% | 3.36% | 1.74% | 7.00% |
| 7 | 2.94% | 0.38% | 9.10% | 1.52% | 2.10% | 5.03% | 77.75% | 0.14% | 0.03% | 0.99% |
| 8 | 0.86% | 0.27% | 47.46% | 10.93% | 10.09% | 14.66% | 0.27% | 14.41% | 0.64% | 0.41% |
| 9 | 1.45% | 44.98% | 3.11% | 1.51% | 0.24% | 0.13% | 0.32% | 0.14% | 2.90% | 45.23% |
| 10 | 14.39% | 1.11% | 1.99% | 7.04% | 1.01% | 9.98% | 1.56% | 0.22% | 62.44% | 0.25% |
| 11 | 0.41% | 35.34% | 0.32% | 0.06% | 0.05% | 0.03% | 0.08% | 0.12% | 0.83% | 62.75% |
| 12 | 1.08% | 17.70% | 7.69% | 20.38% | 0.68% | 21.92% | 20.05% | 2.99% | 3.30% | 4.20% |
| 13 | 0.07% | 0.00% | 0.03% | 0.00% | 0.00% | 0.01% | 0.00% | 99.88% | 0.00% | 0.00% |
| 14 | 2.22% | 45.84% | 9.35% | 3.64% | 0.08% | 3.49% | 5.13% | 1.09% | 0.42% | 28.75% |
| 15 | 3.57% | 0.42% | 2.37% | 11.64% | 1.63% | 25.82% | 4.18% | 0.47% | 49.52% | 0.40% |
| 16 | 8.17% | 10.47% | 3.62% | 18.23% | 1.53% | 20.88% | 1.96% | 24.62% | 3.22% | 7.30% |
| 17 | 2.42% | 0.74% | 6.93% | 13.67% | 7.07% | 7.33% | 1.31% | 40.73% | 0.91% | 18.89% |
| 18 | 6.50% | 4.99% | 1.18% | 0.56% | 1.84% | 0.01% | 0.29% | 0.15% | 84.21% | 0.27% |
| 19 | 1.53% | 0.01% | 2.63% | 0.35% | 4.74% | 3.61% | 5.87% | 81.13% | 0.00% | 0.12% |
layer_names = [layer.name for layer in model_06.layers]
layer_names
['flatten', 'dense', 'batch_normalization', 'dense_1', 'batch_normalization_1', 'dense_2', 'batch_normalization_2', 'dense_3']
# Extract the outputs of all 8 layers:
layer_outputs = [layer.output for layer in model_06.layers[:8]]
# Create a model that will return these outputs, given the model input:
activation_model_06 = tf.keras.models.Model(inputs=model_06.input, outputs=layer_outputs)
# Get activation values on the first 1,500 validation images
# activations_06 = activation_model_06.predict(x_valid_norm[:3250])
activations_06 = activation_model_06.predict(x_valid_norm[:1500])
dense_layer_activations_06 = activations_06[-3]   # 'dense_2', the 1536-unit hidden layer
output_layer_activations_06 = activations_06[-1]  # softmax outputs
47/47 [==============================] - 0s 1ms/step
sklearn.manifold.TSNE¶
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_06 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_06 = tsne_06.fit_transform(dense_layer_activations_06)
# Scaling
tsne_results_06 = (tsne_results_06 - tsne_results_06.min()) / (tsne_results_06.max() - tsne_results_06.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 1500 samples in 0.001s...
[t-SNE] Computed neighbors for 1500 samples in 0.071s...
[t-SNE] Computed conditional probabilities for sample 1000 / 1500
[t-SNE] Computed conditional probabilities for sample 1500 / 1500
[t-SNE] Mean sigma: 15.065521
[t-SNE] KL divergence after 250 iterations with early exaggeration: 66.312225
[t-SNE] KL divergence after 300 iterations: 1.695715
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_06[:,0],tsne_results_06[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_06[:,0],tsne_results_06[:,1], c=y_valid_split[:1500], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)
image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_06):
    dist = np.sum((position - image_positions) ** 2, axis=1)
    if np.min(dist) > 0.02:  # if far enough from other images
        image_positions = np.r_[image_positions, [position]]
        imagebox = mpl.offsetbox.AnnotationBbox(
            # use the validation images the activations came from
            # (x_train[index] indexed the wrong dataset)
            mpl.offsetbox.OffsetImage(x_valid_norm[index], cmap="binary"),
            position, bboxprops={"lw": 1})
        plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
Experiment 7¶
- CNN with 2 convolutional layers, each followed by max pooling
- L2 Regularization (0.001), present in the code but commented out in this run (only batch normalization is active)
Build CNN Model¶
k.clear_session()
model_07 = Sequential([
Conv2D(filters=128, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu,input_shape=x_train_norm.shape[1:]),
MaxPool2D((2, 2),strides=2),
# Dropout(0.3),
Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
MaxPool2D((2, 2),strides=2),
# Dropout(0.3),
Flatten(),
Dense(units=384,activation=tf.nn.relu),
# Dense(units=384,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(0.001)),
BatchNormalization(),
# Dropout(0.3),
Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment7"] = {}
results["Experiment7"]["Architecture"] = "• CNN with 2 layers/max pooling layers\n • L2 Regularization(0.001) (commented out)"
model_07.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D)                            (None, 30, 30, 128)  3584
max_pooling2d (MaxPooling2D)               (None, 15, 15, 128)  0
conv2d_1 (Conv2D)                          (None, 13, 13, 256)  295168
max_pooling2d_1 (MaxPooling2D)             (None, 6, 6, 256)    0
flatten (Flatten)                          (None, 9216)         0
dense (Dense)                              (None, 384)          3539328
batch_normalization (BatchNormalization)   (None, 384)          1536
dense_1 (Dense)                            (None, 10)           3850
=================================================================
Total params: 3843466 (14.66 MB)
Trainable params: 3842698 (14.66 MB)
Non-trainable params: 768 (3.00 KB)
_________________________________________________________________
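The output shapes and parameter counts in the CNN summary can also be verified by hand: with 'valid' padding a k x k convolution shrinks each spatial dimension to (size - k)/stride + 1, a 2x2 max pool with stride 2 halves it (rounding down), and a conv layer has k * k * c_in * filters weights plus one bias per filter:

```python
# Hand-check of the shapes and parameter counts in the summary above
def conv_out(size, k=3, stride=1):
    return (size - k) // stride + 1       # 'valid' padding

def conv_params(k, c_in, filters):
    return k * k * c_in * filters + filters

assert conv_out(32) == 30                          # conv2d: 32x32 -> 30x30
assert conv_params(3, 3, 128) == 3584
assert 30 // 2 == 15                               # 2x2 max pool, stride 2
assert conv_out(15) == 13 and 13 // 2 == 6         # conv2d_1 + pooling
assert conv_params(3, 128, 256) == 295168
assert 6 * 6 * 256 == 9216                         # flatten
assert 9216 * 384 + 384 == 3539328                 # dense
assert 4 * 384 == 1536                             # batch_normalization
assert 384 * 10 + 10 == 3850                       # dense_1
assert 3584 + 295168 + 3539328 + 1536 + 3850 == 3843466   # total params
```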
keras.utils.plot_model(model_07, "CIFAR10_EXP_07.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
model_07.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
metrics=['accuracy'])
Model Train¶
# Start time
start_time = time.time()
history_07 = model_07.fit(x_train_norm
,y_train_split
,epochs=200
,batch_size=64
,verbose=1
,validation_data=(x_valid_norm, y_valid_split)
,callbacks=[
tf.keras.callbacks.ModelCheckpoint("A2_Exp_07_2CNN_BN.h5",save_best_only=True,save_weights_only=False)
,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10),
]
)
# End time
end_time = time.time()
# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment7"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
704/704 [==============================] - 4s 5ms/step - loss: 1.2678 - accuracy: 0.5580 - val_loss: 1.4533 - val_accuracy: 0.5008
Epoch 2/200
704/704 [==============================] - 3s 5ms/step - loss: 0.9628 - accuracy: 0.6645 - val_loss: 2.1573 - val_accuracy: 0.4074
Epoch 3/200
704/704 [==============================] - 3s 5ms/step - loss: 0.8284 - accuracy: 0.7144 - val_loss: 1.2960 - val_accuracy: 0.5830
Epoch 4/200
704/704 [==============================] - 3s 5ms/step - loss: 0.7231 - accuracy: 0.7514 - val_loss: 1.0535 - val_accuracy: 0.6364
Epoch 5/200
704/704 [==============================] - 3s 5ms/step - loss: 0.6363 - accuracy: 0.7792 - val_loss: 0.9151 - val_accuracy: 0.6892
Epoch 6/200
704/704 [==============================] - 3s 5ms/step - loss: 0.5506 - accuracy: 0.8075 - val_loss: 0.9850 - val_accuracy: 0.6806
Epoch 7/200
704/704 [==============================] - 3s 4ms/step - loss: 0.4700 - accuracy: 0.8357 - val_loss: 0.9247 - val_accuracy: 0.7110
Epoch 8/200
704/704 [==============================] - 3s 5ms/step - loss: 0.3857 - accuracy: 0.8669 - val_loss: 1.0033 - val_accuracy: 0.6950
Epoch 9/200
704/704 [==============================] - 3s 5ms/step - loss: 0.3066 - accuracy: 0.8937 - val_loss: 1.0783 - val_accuracy: 0.7106
Epoch 10/200
704/704 [==============================] - 3s 4ms/step - loss: 0.2392 - accuracy: 0.9178 - val_loss: 1.1189 - val_accuracy: 0.7036
Epoch 11/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1933 - accuracy: 0.9340 - val_loss: 1.2544 - val_accuracy: 0.6988
Epoch 12/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1544 - accuracy: 0.9478 - val_loss: 1.5356 - val_accuracy: 0.6682
Epoch 13/200
704/704 [==============================] - 3s 5ms/step - loss: 0.1321 - accuracy: 0.9552 - val_loss: 1.3954 - val_accuracy: 0.7046
Epoch 14/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1036 - accuracy: 0.9654 - val_loss: 1.5925 - val_accuracy: 0.6998
Epoch 15/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1136 - accuracy: 0.9605 - val_loss: 1.5677 - val_accuracy: 0.6912
Epoch 16/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0983 - accuracy: 0.9660 - val_loss: 1.4800 - val_accuracy: 0.7128
Epoch 17/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0740 - accuracy: 0.9751 - val_loss: 1.4750 - val_accuracy: 0.7092
Epoch 18/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0783 - accuracy: 0.9730 - val_loss: 1.5853 - val_accuracy: 0.7024
Epoch 19/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0621 - accuracy: 0.9796 - val_loss: 1.6739 - val_accuracy: 0.7036
Epoch 20/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0668 - accuracy: 0.9774 - val_loss: 1.7103 - val_accuracy: 0.7064
Epoch 21/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0628 - accuracy: 0.9784 - val_loss: 1.5978 - val_accuracy: 0.7134
Epoch 22/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0555 - accuracy: 0.9816 - val_loss: 1.7585 - val_accuracy: 0.6932
Epoch 23/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0600 - accuracy: 0.9794 - val_loss: 1.7080 - val_accuracy: 0.7148
Epoch 24/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0451 - accuracy: 0.9847 - val_loss: 1.7430 - val_accuracy: 0.6884
Epoch 25/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0916 - accuracy: 0.9700 - val_loss: 1.6268 - val_accuracy: 0.7106
Epoch 26/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0352 - accuracy: 0.9878 - val_loss: 1.8817 - val_accuracy: 0.7006
Epoch 27/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0521 - accuracy: 0.9823 - val_loss: 1.6954 - val_accuracy: 0.7194
Epoch 28/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0383 - accuracy: 0.9867 - val_loss: 2.0266 - val_accuracy: 0.6938
Epoch 29/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0397 - accuracy: 0.9864 - val_loss: 1.8967 - val_accuracy: 0.7092
Epoch 30/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0616 - accuracy: 0.9798 - val_loss: 1.6888 - val_accuracy: 0.7164
Epoch 31/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0246 - accuracy: 0.9916 - val_loss: 1.7348 - val_accuracy: 0.7182
Epoch 32/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0622 - accuracy: 0.9799 - val_loss: 1.8087 - val_accuracy: 0.7250
Epoch 33/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0197 - accuracy: 0.9932 - val_loss: 2.0883 - val_accuracy: 0.6954
Epoch 34/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0952 - accuracy: 0.9702 - val_loss: 1.6310 - val_accuracy: 0.7270
Epoch 35/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0188 - accuracy: 0.9941 - val_loss: 1.6175 - val_accuracy: 0.7330
Epoch 36/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0202 - accuracy: 0.9938 - val_loss: 1.8467 - val_accuracy: 0.7132
Epoch 37/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0342 - accuracy: 0.9883 - val_loss: 1.8946 - val_accuracy: 0.7202
Epoch 38/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0282 - accuracy: 0.9905 - val_loss: 1.8807 - val_accuracy: 0.7212
Epoch 39/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0392 - accuracy: 0.9868 - val_loss: 1.9510 - val_accuracy: 0.6984
Epoch 40/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0466 - accuracy: 0.9847 - val_loss: 1.8118 - val_accuracy: 0.7240
Epoch 41/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0256 - accuracy: 0.9918 - val_loss: 1.7694 - val_accuracy: 0.7218
Epoch 42/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0151 - accuracy: 0.9952 -
val_loss: 2.0283 - val_accuracy: 0.6974 Epoch 43/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0322 - accuracy: 0.9891 - val_loss: 2.1238 - val_accuracy: 0.7052 Epoch 44/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0379 - accuracy: 0.9876 - val_loss: 1.7790 - val_accuracy: 0.7208 Epoch 45/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0219 - accuracy: 0.9925 - val_loss: 1.8939 - val_accuracy: 0.7238 Time taken to train Model: 145.38 seconds
train_loss = history_07.history['loss'][-1] # Training loss from the last epoch
train_accuracy = history_07.history['accuracy'][-1] # Training accuracy from the last epoch
val_loss = history_07.history['val_loss'][-1] # Validation loss from the last epoch
val_accuracy = history_07.history['val_accuracy'][-1] # Validation accuracy from the last epoch
# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")
model_07 = tf.keras.models.load_model("A2_Exp_07_2CNN_BN.h5")
test_loss, test_accuracy = model_07.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")
results["Experiment7"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment7"]["Test Loss"] = round(test_loss,3)
results["Experiment7"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment7"]["Train Loss"] = round(train_loss,3)
results["Experiment7"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment7"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.022, Training Accuracy: 0.993 Validation Loss: 1.894, Validation Accuracy: 0.724 Test Loss: 0.907, Test Accuracy: 0.701
pred07 = model_07.predict(x_test_norm)
print('shape of preds: ', pred07.shape)
313/313 [==============================] - 0s 1ms/step shape of preds: (10000, 10)
history_07_dict = history_07.history
history_07_df=pd.DataFrame(history_07_dict)
history_07_df.tail().round(3)
| | loss | accuracy | val_loss | val_accuracy |
|---|---|---|---|---|
| 40 | 0.026 | 0.992 | 1.769 | 0.722 |
| 41 | 0.015 | 0.995 | 2.028 | 0.697 |
| 42 | 0.032 | 0.989 | 2.124 | 0.705 |
| 43 | 0.038 | 0.988 | 1.779 | 0.721 |
| 44 | 0.022 | 0.993 | 1.894 | 0.724 |
Plotting Performance Metrics¶
We use Matplotlib to create two plots, displaying the training and validation accuracy and loss, respectively, for each training epoch side by side.
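The helper `display_training_curves` called below is defined earlier in the notebook. A minimal sketch of what such a helper might look like, with the signature inferred from the calls here (the body is an assumption, not the notebook's actual definition):

```python
import matplotlib.pyplot as plt

# Hypothetical sketch of display_training_curves; signature inferred from
# the calls below: (train metric list, validation metric list, title, subplot code).
def display_training_curves(training, validation, title, subplot):
    """Plot a training metric against its validation counterpart per epoch."""
    ax = plt.subplot(subplot)          # e.g. 211 = top panel of a 2x1 grid
    ax.plot(training, label='train')
    ax.plot(validation, label='validation')
    ax.set_title('model ' + title)
    ax.set_ylabel(title)
    ax.set_xlabel('epoch')
    ax.legend()
```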
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_07.history['accuracy'], history_07.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_07.history['loss'], history_07.history['val_loss'], 'loss', 212)
Confusion matrices¶
We compute the classification report and confusion matrix using sklearn.metrics, then visualize the confusion matrix to see what it tells us.
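The helpers `print_validation_report` and `plot_confusion_matrix` are defined earlier in the notebook. A rough sketch of what the report helper might do, judging from the output below (the exact body is an assumption):

```python
import numpy as np
from sklearn.metrics import classification_report, accuracy_score, mean_squared_error

# Hypothetical sketch of print_validation_report, inferred from its printed output.
def print_validation_report(y_true, y_pred):
    print("Classification Report")
    print(classification_report(y_true, y_pred))
    print("Accuracy Score:", accuracy_score(y_true, y_pred))
    # Note: RMSE over class indices is only a coarse signal here, since
    # CIFAR-10 labels are nominal categories, not ordinal values.
    print("Root Mean Square Error:", np.sqrt(mean_squared_error(y_true, y_pred)))
```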
pred07_cm=np.argmax(pred07, axis=1)
print_validation_report(y_test, pred07_cm)
Classification Report
precision recall f1-score support
0 0.83 0.64 0.72 1000
1 0.86 0.82 0.84 1000
2 0.61 0.57 0.59 1000
3 0.42 0.72 0.53 1000
4 0.74 0.61 0.67 1000
5 0.63 0.61 0.62 1000
6 0.70 0.84 0.76 1000
7 0.88 0.64 0.74 1000
8 0.86 0.77 0.81 1000
9 0.82 0.79 0.80 1000
accuracy 0.70 10000
macro avg 0.73 0.70 0.71 10000
weighted avg 0.73 0.70 0.71 10000
Accuracy Score: 0.7008
Root Mean Square Error: 2.1608100332976985
plot_confusion_matrix(y_test,pred07_cm)
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred07[0:20], columns = ['airplane'
,'automobile'
,'bird'
,'cat'
,'deer'
,'dog'
,'frog'
,'horse'
,'ship'
,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
| | airplane | automobile | bird | cat | deer | dog | frog | horse | ship | truck |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.00% | 0.00% | 0.04% | 84.82% | 0.02% | 2.12% | 12.58% | 0.01% | 0.41% | 0.00% |
| 1 | 6.45% | 68.79% | 0.06% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 24.37% | 0.33% |
| 2 | 6.13% | 6.65% | 2.07% | 31.19% | 0.86% | 1.43% | 0.72% | 2.01% | 48.12% | 0.82% |
| 3 | 74.27% | 1.52% | 1.43% | 0.17% | 0.13% | 0.01% | 0.72% | 0.02% | 21.48% | 0.25% |
| 4 | 0.00% | 0.00% | 3.06% | 21.39% | 3.61% | 0.83% | 71.11% | 0.00% | 0.00% | 0.00% |
| 5 | 0.01% | 0.01% | 0.47% | 3.63% | 0.50% | 3.97% | 91.15% | 0.22% | 0.01% | 0.03% |
| 6 | 0.32% | 94.77% | 0.11% | 0.14% | 0.00% | 0.66% | 0.50% | 0.00% | 0.01% | 3.47% |
| 7 | 0.10% | 0.00% | 2.16% | 4.61% | 18.78% | 0.56% | 73.66% | 0.10% | 0.02% | 0.02% |
| 8 | 0.05% | 0.00% | 0.23% | 96.94% | 0.13% | 1.70% | 0.23% | 0.71% | 0.00% | 0.00% |
| 9 | 0.28% | 93.06% | 0.58% | 0.11% | 0.00% | 0.01% | 0.02% | 0.00% | 0.20% | 5.71% |
| 10 | 5.74% | 0.01% | 3.90% | 33.00% | 51.30% | 3.32% | 0.03% | 2.57% | 0.10% | 0.03% |
| 11 | 0.00% | 0.53% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 99.47% |
| 12 | 0.03% | 0.01% | 44.11% | 11.27% | 3.79% | 37.68% | 1.62% | 1.45% | 0.02% | 0.01% |
| 13 | 0.03% | 0.00% | 0.78% | 1.69% | 3.83% | 1.21% | 0.02% | 92.43% | 0.00% | 0.01% |
| 14 | 0.00% | 0.03% | 0.00% | 0.01% | 0.00% | 0.00% | 0.00% | 0.00% | 0.01% | 99.95% |
| 15 | 0.04% | 0.01% | 0.26% | 1.50% | 0.12% | 0.02% | 66.20% | 0.00% | 31.85% | 0.01% |
| 16 | 0.02% | 0.01% | 0.23% | 16.94% | 0.01% | 82.48% | 0.03% | 0.27% | 0.00% | 0.02% |
| 17 | 0.43% | 0.02% | 6.93% | 13.90% | 0.79% | 11.21% | 5.65% | 60.80% | 0.02% | 0.25% |
| 18 | 0.34% | 0.62% | 0.00% | 0.03% | 0.00% | 0.00% | 0.01% | 0.00% | 98.12% | 0.87% |
| 19 | 0.00% | 0.00% | 0.13% | 0.20% | 0.46% | 0.04% | 99.17% | 0.00% | 0.00% | 0.00% |
layer_names = []
for layer in model_07.layers:
layer_names.append(layer.name)
layer_names
['conv2d', 'max_pooling2d', 'conv2d_1', 'max_pooling2d_1', 'flatten', 'dense', 'batch_normalization', 'dense_1']
sklearn.manifold.TSNE¶
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
# Extracts the outputs of the top 8 layers:
layer_outputs_07 = [layer.output for layer in model_07.layers[:8]]
# Creates a model that will return these outputs, given the model input:
activation_model_07 = tf.keras.models.Model(inputs=model_07.input, outputs=layer_outputs_07)
# Get activation values for the last dense layer
# activations_07 = activation_model_07.predict(x_valid_norm[:3250])
activations_07 = activation_model_07.predict(x_valid_norm[:1200])
dense_layer_activations_07 = activations_07[-3]
output_layer_activations_07 = activations_07[-1]
38/38 [==============================] - 0s 2ms/step
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_07 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_07 = tsne_07.fit_transform(dense_layer_activations_07)
# Scaling
tsne_results_07 = (tsne_results_07 - tsne_results_07.min()) / (tsne_results_07.max() - tsne_results_07.min())
[t-SNE] Computing 121 nearest neighbors... [t-SNE] Indexed 1200 samples in 0.000s... [t-SNE] Computed neighbors for 1200 samples in 0.022s... [t-SNE] Computed conditional probabilities for sample 1000 / 1200 [t-SNE] Computed conditional probabilities for sample 1200 / 1200 [t-SNE] Mean sigma: 1.826008 [t-SNE] KL divergence after 250 iterations with early exaggeration: 65.998062 [t-SNE] KL divergence after 300 iterations: 1.695120
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_07[:,0],tsne_results_07[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_07[:,0],tsne_results_07[:,1], c=y_valid_split[:1200], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)
image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_07):
dist = np.sum((position - image_positions) ** 2, axis=1)
if np.min(dist) > 0.02: # if far enough from other images
image_positions = np.r_[image_positions, [position]]
imagebox = mpl.offsetbox.AnnotationBbox(
mpl.offsetbox.OffsetImage(x_valid_norm[index], cmap="binary"),  # use the validation images the t-SNE embedding was computed on
position, bboxprops={"lw": 1})
plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
Experiment 8¶
- CNN with 3 convolutional layers, each followed by max pooling
- Batch Normalization (no dropout or L2 regularization; the L2-regularized dense layer is left commented out in the code below)
Build CNN Model¶
k.clear_session()
model_08 = Sequential([
Conv2D(filters=128, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu,input_shape=x_train_norm.shape[1:]),
MaxPool2D((2, 2),strides=2),
# Dropout(0.3),
Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
MaxPool2D((2, 2),strides=2),
# Dropout(0.3),
Conv2D(filters=512, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
MaxPool2D((2, 2),strides=2),
# Dropout(0.3),
Flatten(),
Dense(units=384,activation=tf.nn.relu),
# Dense(units=384,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(0.001)),
BatchNormalization(),
# Dropout(0.3),
Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment8"] = {}
results["Experiment8"]["Architecture"] = "• CNN with 3 layers/max pooling layers\n • Batch Normalization"
model_08.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D)                  (None, 30, 30, 128)       3584
max_pooling2d (MaxPooling2D)     (None, 15, 15, 128)       0
conv2d_1 (Conv2D)                (None, 13, 13, 256)       295168
max_pooling2d_1 (MaxPooling2D)   (None, 6, 6, 256)         0
conv2d_2 (Conv2D)                (None, 4, 4, 512)         1180160
max_pooling2d_2 (MaxPooling2D)   (None, 2, 2, 512)         0
flatten (Flatten)                (None, 2048)              0
dense (Dense)                    (None, 384)               786816
batch_normalization (BatchNormalization) (None, 384)       1536
dense_1 (Dense)                  (None, 10)                3850
=================================================================
Total params: 2271114 (8.66 MB)
Trainable params: 2270346 (8.66 MB)
Non-trainable params: 768 (3.00 KB)
_________________________________________________________________
keras.utils.plot_model(model_08, "CIFAR10_EXP_08.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
model_08.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
metrics=['accuracy'])
Model Train¶
# Start time
start_time = time.time()
history_08 = model_08.fit(x_train_norm
,y_train_split
,epochs=200
,batch_size=64
,verbose=1
,validation_data=(x_valid_norm, y_valid_split)
,callbacks=[
tf.keras.callbacks.ModelCheckpoint("A2_Exp_08_3CNN_BN.h5",save_best_only=True,save_weights_only=False)
,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10),
]
)
# End time
end_time = time.time()
# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment8"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200 704/704 [==============================] - 5s 5ms/step - loss: 1.2948 - accuracy: 0.5441 - val_loss: 1.4855 - val_accuracy: 0.4982 Epoch 2/200 704/704 [==============================] - 3s 5ms/step - loss: 0.9491 - accuracy: 0.6704 - val_loss: 1.4875 - val_accuracy: 0.5378 Epoch 3/200 704/704 [==============================] - 3s 5ms/step - loss: 0.7888 - accuracy: 0.7269 - val_loss: 1.0235 - val_accuracy: 0.6504 Epoch 4/200 704/704 [==============================] - 3s 5ms/step - loss: 0.6651 - accuracy: 0.7691 - val_loss: 1.9600 - val_accuracy: 0.4948 Epoch 5/200 704/704 [==============================] - 3s 5ms/step - loss: 0.5654 - accuracy: 0.8034 - val_loss: 0.9799 - val_accuracy: 0.6866 Epoch 6/200 704/704 [==============================] - 3s 5ms/step - loss: 0.4630 - accuracy: 0.8384 - val_loss: 1.2166 - val_accuracy: 0.6458 Epoch 7/200 704/704 [==============================] - 3s 5ms/step - loss: 0.3849 - accuracy: 0.8667 - val_loss: 1.0119 - val_accuracy: 0.7018 Epoch 8/200 704/704 [==============================] - 3s 5ms/step - loss: 0.3091 - accuracy: 0.8905 - val_loss: 1.0915 - val_accuracy: 0.7100 Epoch 9/200 704/704 [==============================] - 3s 5ms/step - loss: 0.2489 - accuracy: 0.9128 - val_loss: 1.2036 - val_accuracy: 0.7000 Epoch 10/200 704/704 [==============================] - 3s 5ms/step - loss: 0.2037 - accuracy: 0.9289 - val_loss: 1.2918 - val_accuracy: 0.7012 Epoch 11/200 704/704 [==============================] - 3s 5ms/step - loss: 0.1702 - accuracy: 0.9392 - val_loss: 1.4823 - val_accuracy: 0.6802 Epoch 12/200 704/704 [==============================] - 3s 5ms/step - loss: 0.1521 - accuracy: 0.9467 - val_loss: 1.5638 - val_accuracy: 0.6856 Epoch 13/200 704/704 [==============================] - 3s 5ms/step - loss: 0.1391 - accuracy: 0.9512 - val_loss: 1.2930 - val_accuracy: 0.7238 Epoch 14/200 704/704 [==============================] - 3s 5ms/step - loss: 0.1171 - accuracy: 0.9600 - val_loss: 1.4168 - val_accuracy: 
0.7076 Epoch 15/200 704/704 [==============================] - 3s 5ms/step - loss: 0.1124 - accuracy: 0.9609 - val_loss: 1.4324 - val_accuracy: 0.7166 Epoch 16/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0994 - accuracy: 0.9651 - val_loss: 1.5606 - val_accuracy: 0.6976 Epoch 17/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0875 - accuracy: 0.9692 - val_loss: 1.7153 - val_accuracy: 0.7072 Epoch 18/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0945 - accuracy: 0.9666 - val_loss: 1.3756 - val_accuracy: 0.7278 Epoch 19/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0806 - accuracy: 0.9713 - val_loss: 1.6530 - val_accuracy: 0.7210 Epoch 20/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0741 - accuracy: 0.9743 - val_loss: 1.4687 - val_accuracy: 0.7280 Epoch 21/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0622 - accuracy: 0.9786 - val_loss: 1.5286 - val_accuracy: 0.7176 Epoch 22/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0730 - accuracy: 0.9742 - val_loss: 1.9872 - val_accuracy: 0.6860 Epoch 23/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0677 - accuracy: 0.9770 - val_loss: 1.6474 - val_accuracy: 0.7192 Epoch 24/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0637 - accuracy: 0.9772 - val_loss: 1.7021 - val_accuracy: 0.7168 Epoch 25/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0557 - accuracy: 0.9808 - val_loss: 1.7471 - val_accuracy: 0.7108 Epoch 26/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0757 - accuracy: 0.9737 - val_loss: 1.8876 - val_accuracy: 0.7058 Epoch 27/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0381 - accuracy: 0.9872 - val_loss: 1.5911 - val_accuracy: 0.7140 Epoch 28/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0554 - accuracy: 0.9807 - val_loss: 1.7155 
- val_accuracy: 0.7186 Epoch 29/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0556 - accuracy: 0.9807 - val_loss: 1.8313 - val_accuracy: 0.7168 Epoch 30/200 704/704 [==============================] - 3s 5ms/step - loss: 0.0469 - accuracy: 0.9839 - val_loss: 2.1171 - val_accuracy: 0.7006 Time taken to train Model: 101.77 seconds
train_loss = history_08.history['loss'][-1] # Training loss from the last epoch
train_accuracy = history_08.history['accuracy'][-1] # Training accuracy from the last epoch
val_loss = history_08.history['val_loss'][-1] # Validation loss from the last epoch
val_accuracy = history_08.history['val_accuracy'][-1] # Validation accuracy from the last epoch
# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")
model_08 = tf.keras.models.load_model("A2_Exp_08_3CNN_BN.h5")
test_loss, test_accuracy = model_08.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")
results["Experiment8"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment8"]["Test Loss"] = round(test_loss,3)
results["Experiment8"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment8"]["Train Loss"] = round(train_loss,3)
results["Experiment8"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment8"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.047, Training Accuracy: 0.984 Validation Loss: 2.117, Validation Accuracy: 0.701 Test Loss: 0.944, Test Accuracy: 0.699
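Since each experiment records the same metric keys into `results`, the dictionary can later be flattened into a side-by-side comparison table with pandas. A sketch using hypothetical example values (in the notebook, the accumulated `results` dict itself would be passed):

```python
import pandas as pd

# Hypothetical example mirroring the structure of the notebook's `results` dict.
example_results = {
    "Experiment7": {"Test Accuracy": 0.701, "Test Loss": 0.907},
    "Experiment8": {"Test Accuracy": 0.699, "Test Loss": 0.944},
}

# DataFrame-from-dict puts experiments in columns; transpose for one row each.
summary = pd.DataFrame(example_results).T
print(summary)
```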
pred08 = model_08.predict(x_test_norm)
print('shape of preds: ', pred08.shape)
313/313 [==============================] - 0s 1ms/step shape of preds: (10000, 10)
history_08_dict = history_08.history
history_08_df=pd.DataFrame(history_08_dict)
history_08_df.tail().round(3)
| | loss | accuracy | val_loss | val_accuracy |
|---|---|---|---|---|
| 25 | 0.076 | 0.974 | 1.888 | 0.706 |
| 26 | 0.038 | 0.987 | 1.591 | 0.714 |
| 27 | 0.055 | 0.981 | 1.715 | 0.719 |
| 28 | 0.056 | 0.981 | 1.831 | 0.717 |
| 29 | 0.047 | 0.984 | 2.117 | 0.701 |
Plotting Performance Metrics¶
We use Matplotlib to create two plots, displaying the training and validation accuracy and loss, respectively, for each training epoch side by side.
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_08.history['accuracy'], history_08.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_08.history['loss'], history_08.history['val_loss'], 'loss', 212)
Confusion matrices¶
We compute the classification report and confusion matrix using sklearn.metrics, then visualize the confusion matrix to see what it tells us.
pred08_cm=np.argmax(pred08, axis=1)
print_validation_report(y_test, pred08_cm)
Classification Report
precision recall f1-score support
0 0.71 0.69 0.70 1000
1 0.95 0.71 0.81 1000
2 0.65 0.65 0.65 1000
3 0.46 0.72 0.56 1000
4 0.75 0.62 0.68 1000
5 0.70 0.55 0.61 1000
6 0.86 0.75 0.80 1000
7 0.93 0.61 0.74 1000
8 0.56 0.96 0.71 1000
9 0.85 0.73 0.79 1000
accuracy 0.70 10000
macro avg 0.74 0.70 0.70 10000
weighted avg 0.74 0.70 0.70 10000
Accuracy Score: 0.6988
Root Mean Square Error: 2.30967530185522
plot_confusion_matrix(y_test,pred08_cm)
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred08[0:20], columns = ['airplane'
,'automobile'
,'bird'
,'cat'
,'deer'
,'dog'
,'frog'
,'horse'
,'ship'
,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
| | airplane | automobile | bird | cat | deer | dog | frog | horse | ship | truck |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.20% | 0.00% | 0.03% | 63.93% | 0.00% | 1.36% | 0.56% | 0.00% | 33.86% | 0.05% |
| 1 | 0.01% | 0.01% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 99.98% | 0.00% |
| 2 | 0.76% | 0.18% | 0.00% | 0.04% | 0.00% | 0.00% | 0.00% | 0.00% | 98.80% | 0.21% |
| 3 | 76.48% | 0.02% | 0.48% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 23.01% | 0.00% |
| 4 | 0.00% | 0.00% | 2.15% | 7.70% | 23.22% | 0.31% | 66.58% | 0.00% | 0.03% | 0.00% |
| 5 | 0.00% | 0.00% | 0.03% | 2.06% | 0.00% | 0.22% | 97.54% | 0.00% | 0.14% | 0.00% |
| 6 | 0.42% | 57.52% | 0.42% | 16.05% | 0.01% | 1.73% | 0.11% | 0.15% | 6.66% | 16.93% |
| 7 | 6.59% | 0.00% | 33.58% | 6.53% | 21.15% | 4.08% | 25.78% | 0.15% | 2.05% | 0.09% |
| 8 | 0.03% | 0.00% | 0.85% | 94.48% | 0.91% | 3.61% | 0.03% | 0.07% | 0.01% | 0.01% |
| 9 | 0.57% | 68.29% | 0.15% | 0.00% | 0.00% | 0.00% | 0.02% | 0.00% | 16.99% | 13.98% |
| 10 | 59.07% | 0.03% | 3.53% | 12.66% | 13.34% | 0.32% | 0.00% | 0.22% | 10.75% | 0.07% |
| 11 | 0.00% | 0.03% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.02% | 99.94% |
| 12 | 0.31% | 0.02% | 0.90% | 22.42% | 21.36% | 52.52% | 1.01% | 0.62% | 0.82% | 0.02% |
| 13 | 0.25% | 0.00% | 0.06% | 1.23% | 41.70% | 21.65% | 0.00% | 35.02% | 0.08% | 0.01% |
| 14 | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.01% | 99.99% |
| 15 | 0.61% | 0.00% | 0.01% | 0.09% | 0.01% | 0.00% | 0.39% | 0.00% | 98.88% | 0.00% |
| 16 | 0.01% | 0.02% | 0.02% | 3.06% | 0.02% | 95.45% | 0.01% | 1.36% | 0.01% | 0.05% |
| 17 | 5.14% | 0.57% | 2.95% | 64.18% | 0.69% | 3.42% | 0.10% | 19.65% | 0.23% | 3.08% |
| 18 | 0.01% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 99.99% | 0.00% |
| 19 | 0.02% | 0.01% | 0.09% | 1.68% | 0.18% | 0.62% | 97.39% | 0.00% | 0.00% | 0.00% |
layer_names = []
for layer in model_08.layers:
layer_names.append(layer.name)
layer_names
['conv2d', 'max_pooling2d', 'conv2d_1', 'max_pooling2d_1', 'conv2d_2', 'max_pooling2d_2', 'flatten', 'dense', 'batch_normalization', 'dense_1']
# Extracts the outputs of the top 10 layers:
layer_outputs_08 = [layer.output for layer in model_08.layers[:10]]
# Creates a model that will return these outputs, given the model input:
activation_model_08 = tf.keras.models.Model(inputs=model_08.input, outputs=layer_outputs_08)
# Get activation values for the last dense layer
# activations_08 = activation_model_08.predict(x_valid_norm[:3250])
activations_08 = activation_model_08.predict(x_valid_norm[:1200])
dense_layer_activations_08 = activations_08[-3]
output_layer_activations_08 = activations_08[-1]
38/38 [==============================] - 0s 1ms/step
sklearn.manifold.TSNE¶
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_08 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_08 = tsne_08.fit_transform(dense_layer_activations_08)
# Scaling
tsne_results_08 = (tsne_results_08 - tsne_results_08.min()) / (tsne_results_08.max() - tsne_results_08.min())
[t-SNE] Computing 121 nearest neighbors... [t-SNE] Indexed 1200 samples in 0.001s... [t-SNE] Computed neighbors for 1200 samples in 0.021s... [t-SNE] Computed conditional probabilities for sample 1000 / 1200 [t-SNE] Computed conditional probabilities for sample 1200 / 1200 [t-SNE] Mean sigma: 2.350581 [t-SNE] KL divergence after 250 iterations with early exaggeration: 65.708244 [t-SNE] KL divergence after 300 iterations: 1.568058
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_08[:,0],tsne_results_08[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_08[:,0],tsne_results_08[:,1], c=y_valid_split[:1200], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)
image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_08):
dist = np.sum((position - image_positions) ** 2, axis=1)
if np.min(dist) > 0.02: # if far enough from other images
image_positions = np.r_[image_positions, [position]]
imagebox = mpl.offsetbox.AnnotationBbox(
mpl.offsetbox.OffsetImage(x_valid_norm[index], cmap="binary"),  # use the validation images the t-SNE embedding was computed on
position, bboxprops={"lw": 1})
plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
Experiment 9¶
- CNN with 3 layers/max pooling layers
- Dropout(0.3)
- L2 Regularization(0.001)
- Batch Normalization
Build CNN Model¶
k.clear_session()
model_09 = Sequential([
Conv2D(filters=128, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu,input_shape=x_train_norm.shape[1:]),
MaxPool2D((2, 2),strides=2),
Dropout(0.3),
Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
MaxPool2D((2, 2),strides=2),
Dropout(0.3),
Conv2D(filters=512, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
MaxPool2D((2, 2),strides=2),
Dropout(0.3),
Flatten(),
# Dense(units=384,activation=tf.nn.relu),
Dense(units=384,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(0.001)),
BatchNormalization(),
Dropout(0.3),
Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment9"] = {}
results["Experiment9"]["Architecture"] = "• CNN with 3 layers/max pooling layers\n • Dropout(0.3)\n • L2 Regularization(0.001)\n • Batch Normalization"
model_09.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D)                  (None, 30, 30, 128)       3584
max_pooling2d (MaxPooling2D)     (None, 15, 15, 128)       0
dropout (Dropout)                (None, 15, 15, 128)       0
conv2d_1 (Conv2D)                (None, 13, 13, 256)       295168
max_pooling2d_1 (MaxPooling2D)   (None, 6, 6, 256)         0
dropout_1 (Dropout)              (None, 6, 6, 256)         0
conv2d_2 (Conv2D)                (None, 4, 4, 512)         1180160
max_pooling2d_2 (MaxPooling2D)   (None, 2, 2, 512)         0
dropout_2 (Dropout)              (None, 2, 2, 512)         0
flatten (Flatten)                (None, 2048)              0
dense (Dense)                    (None, 384)               786816
batch_normalization (BatchNormalization) (None, 384)       1536
dropout_3 (Dropout)              (None, 384)               0
dense_1 (Dense)                  (None, 10)                3850
=================================================================
Total params: 2271114 (8.66 MB)
Trainable params: 2270346 (8.66 MB)
Non-trainable params: 768 (3.00 KB)
_________________________________________________________________
keras.utils.plot_model(model_09, "CIFAR10_EXP_09.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
model_09.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
metrics=['accuracy'])
Model Train¶
# Start time
start_time = time.time()
history_09 = model_09.fit(x_train_norm
,y_train_split
,epochs=200
,batch_size=64
,verbose=1
,validation_data=(x_valid_norm, y_valid_split)
,callbacks=[
tf.keras.callbacks.ModelCheckpoint("A2_Exp_09_3CNN_DO_L2_BN.h5",save_best_only=True,save_weights_only=False)
,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10),
]
)
# End time
end_time = time.time()
# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment9"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
2024-10-20 03:52:59.267326: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:961] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape insequential/dropout/dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
Epoch 1/200
704/704 [==============================] - 6s 6ms/step - loss: 1.9834 - accuracy: 0.4108 - val_loss: 1.6030 - val_accuracy: 0.4954
Epoch 2/200
704/704 [==============================] - 4s 5ms/step - loss: 1.4095 - accuracy: 0.5593 - val_loss: 1.4080 - val_accuracy: 0.5346
...
Epoch 48/200
704/704 [==============================] - 4s 5ms/step - loss: 0.5785 - accuracy: 0.8509 - val_loss: 0.7534 - val_accuracy: 0.7968
Epoch 49/200
704/704 [==============================] - 4s 5ms/step - loss: 0.5736 - accuracy: 0.8533 - val_loss: 0.7459 - val_accuracy: 0.7954
Time taken to train Model: 185.30 seconds
train_loss = history_09.history['loss'][-1] # Training loss from the last epoch
train_accuracy = history_09.history['accuracy'][-1] # Training accuracy from the last epoch
val_loss = history_09.history['val_loss'][-1] # Validation loss from the last epoch
val_accuracy = history_09.history['val_accuracy'][-1] # Validation accuracy from the last epoch
# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")
model_09 = tf.keras.models.load_model("A2_Exp_09_3CNN_DO_L2_BN.h5")
test_loss, test_accuracy = model_09.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")
results["Experiment9"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment9"]["Test Loss"] = round(test_loss,3)
results["Experiment9"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment9"]["Train Loss"] = round(train_loss,3)
results["Experiment9"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment9"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.574, Training Accuracy: 0.853
Validation Loss: 0.746, Validation Accuracy: 0.795
Test Loss: 0.743, Test Accuracy: 0.798
pred09 = model_09.predict(x_test_norm)
print('shape of preds: ', pred09.shape)
313/313 [==============================] - 0s 1ms/step
shape of preds:  (10000, 10)
history_09_dict = history_09.history
history_09_df=pd.DataFrame(history_09_dict)
history_09_df.tail().round(3)
| | loss | accuracy | val_loss | val_accuracy |
|---|---|---|---|---|
| 44 | 0.593 | 0.847 | 0.738 | 0.802 |
| 45 | 0.589 | 0.847 | 0.775 | 0.794 |
| 46 | 0.581 | 0.852 | 0.769 | 0.795 |
| 47 | 0.578 | 0.851 | 0.753 | 0.797 |
| 48 | 0.574 | 0.853 | 0.746 | 0.795 |
Plotting Performance Metrics¶
We use Matplotlib to create two stacked plots, displaying the training and validation accuracy (top) and loss (bottom) for each training epoch.
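`display_training_curves` is a helper defined earlier in the notebook; its body is not shown in this section, so the following is only a minimal sketch of what such a helper might look like. The third-digit subplot codes passed below (211, 212) are standard Matplotlib codes for two rows, one column, positions 1 and 2.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs in scripts
import matplotlib.pyplot as plt

def display_training_curves(training, validation, title, subplot):
    """Plot training vs. validation values of one metric in the given subplot."""
    ax = plt.subplot(subplot)           # e.g. 211 = 2 rows, 1 column, slot 1
    ax.plot(training, label='train')    # per-epoch metric on the training set
    ax.plot(validation, label='valid')  # per-epoch metric on the validation set
    ax.set_title('model ' + title)
    ax.set_ylabel(title)
    ax.set_xlabel('epoch')
    ax.legend()
```

The actual helper may differ in styling, but any implementation with this signature is compatible with the calls in the next cell.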
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_09.history['accuracy'], history_09.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_09.history['loss'], history_09.history['val_loss'], 'loss', 212)
Confusion matrices¶
Using sklearn.metrics, we print a per-class classification report and the overall accuracy. Then we visualize the confusion matrix and see what it tells us.
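`print_validation_report` is another user-defined helper whose body is not shown here. Judging from its output below, a minimal sketch built on `sklearn.metrics` might look like this (the RMSE over integer class labels treats CIFAR-10's nominal classes as ordinal, so it is only a rough sanity value):

```python
import numpy as np
from sklearn.metrics import classification_report, accuracy_score, mean_squared_error

def print_validation_report(y_true, y_pred):
    """Print a per-class classification report, overall accuracy, and RMSE."""
    y_true = np.ravel(y_true)  # flatten (n, 1) label arrays from keras.datasets
    print("Classification Report")
    print(classification_report(y_true, y_pred))
    print("Accuracy Score:", accuracy_score(y_true, y_pred))
    print("Root Mean Square Error:", np.sqrt(mean_squared_error(y_true, y_pred)))
```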
pred09_cm=np.argmax(pred09, axis=1)
print_validation_report(y_test, pred09_cm)
Classification Report
precision recall f1-score support
0 0.83 0.80 0.81 1000
1 0.92 0.89 0.90 1000
2 0.80 0.62 0.70 1000
3 0.66 0.65 0.65 1000
4 0.72 0.82 0.77 1000
5 0.68 0.75 0.71 1000
6 0.81 0.89 0.85 1000
7 0.85 0.81 0.83 1000
8 0.87 0.88 0.88 1000
9 0.85 0.86 0.86 1000
accuracy 0.80 10000
macro avg 0.80 0.80 0.80 10000
weighted avg 0.80 0.80 0.80 10000
Accuracy Score: 0.7976
Root Mean Square Error: 1.8043835512440252
plot_confusion_matrix(y_test,pred09_cm)
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred09[0:20], columns = ['airplane'
,'automobile'
,'bird'
,'cat'
,'deer'
,'dog'
,'frog'
,'horse'
,'ship'
,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
| | airplane | automobile | bird | cat | deer | dog | frog | horse | ship | truck |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.08% | 0.01% | 0.44% | 89.39% | 0.68% | 8.50% | 0.60% | 0.28% | 0.01% | 0.00% |
| 1 | 0.11% | 0.52% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 99.36% | 0.01% |
| 2 | 9.73% | 36.61% | 0.33% | 1.01% | 0.45% | 0.37% | 0.32% | 0.42% | 42.69% | 8.08% |
| 3 | 88.47% | 2.43% | 4.12% | 0.09% | 1.06% | 0.02% | 0.30% | 0.03% | 3.43% | 0.05% |
| 4 | 0.00% | 0.00% | 0.08% | 0.06% | 0.64% | 0.00% | 99.22% | 0.00% | 0.00% | 0.00% |
| 5 | 0.00% | 0.00% | 0.03% | 0.53% | 0.06% | 0.27% | 99.11% | 0.00% | 0.01% | 0.00% |
| 6 | 0.01% | 94.05% | 0.01% | 0.08% | 0.00% | 0.24% | 0.15% | 0.01% | 0.00% | 5.45% |
| 7 | 0.03% | 0.00% | 1.54% | 0.47% | 1.01% | 0.15% | 96.76% | 0.02% | 0.01% | 0.00% |
| 8 | 0.01% | 0.00% | 0.09% | 98.11% | 0.24% | 1.48% | 0.04% | 0.03% | 0.00% | 0.00% |
| 9 | 0.29% | 65.95% | 0.01% | 0.02% | 0.01% | 0.02% | 0.35% | 0.00% | 0.39% | 32.96% |
| 10 | 17.56% | 0.12% | 2.42% | 26.70% | 3.42% | 23.74% | 0.16% | 4.38% | 21.17% | 0.33% |
| 11 | 0.00% | 0.01% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 99.99% |
| 12 | 0.07% | 0.01% | 3.03% | 19.53% | 12.70% | 62.71% | 0.30% | 1.63% | 0.01% | 0.01% |
| 13 | 0.00% | 0.00% | 0.00% | 0.00% | 0.05% | 0.02% | 0.00% | 99.93% | 0.00% | 0.00% |
| 14 | 0.00% | 0.01% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.01% | 99.98% |
| 15 | 3.22% | 0.24% | 0.30% | 0.36% | 0.06% | 0.02% | 12.08% | 0.00% | 83.71% | 0.00% |
| 16 | 0.02% | 0.01% | 0.25% | 4.02% | 0.04% | 95.31% | 0.12% | 0.21% | 0.01% | 0.01% |
| 17 | 0.15% | 0.01% | 2.14% | 19.37% | 3.09% | 15.53% | 0.63% | 58.78% | 0.11% | 0.19% |
| 18 | 0.19% | 0.07% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 99.51% | 0.22% |
| 19 | 0.00% | 0.00% | 0.01% | 0.11% | 0.03% | 0.01% | 99.83% | 0.00% | 0.00% | 0.00% |
layer_names = []
for layer in model_09.layers:
layer_names.append(layer.name)
layer_names
['conv2d', 'max_pooling2d', 'dropout', 'conv2d_1', 'max_pooling2d_1', 'dropout_1', 'conv2d_2', 'max_pooling2d_2', 'dropout_2', 'flatten', 'dense', 'batch_normalization', 'dropout_3', 'dense_1']
# Extract the outputs of all 14 layers:
layer_outputs_09 = [layer.output for layer in model_09.layers[:14]]
# Creates a model that will return these outputs, given the model input:
activation_model_09 = tf.keras.models.Model(inputs=model_09.input, outputs=layer_outputs_09)
# Get activations: [-4] is the hidden Dense layer (used for t-SNE below); [-1] is the softmax output layer
# activations_09 = activation_model_09.predict(x_valid_norm[:3250])
activations_09 = activation_model_09.predict(x_valid_norm[:1200])
dense_layer_activations_09 = activations_09[-4]
output_layer_activations_09 = activations_09[-1]
38/38 [==============================] - 0s 1ms/step
sklearn.manifold.TSNE¶
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_09 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_09 = tsne_09.fit_transform(dense_layer_activations_09)
# Scaling
tsne_results_09 = (tsne_results_09 - tsne_results_09.min()) / (tsne_results_09.max() - tsne_results_09.min())
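The rescaling above is a global min-max normalization (a single min and max taken over both coordinates), which maps the embedding into the unit square while preserving its aspect ratio; the 0.02 squared-distance threshold used later when overlaying image thumbnails is therefore relative to a plot of unit extent. A standalone sketch, with a hypothetical 2-D array standing in for `tsne_results_09`:

```python
import numpy as np

# Hypothetical embedding standing in for the t-SNE output
emb = np.array([[-3.0, 5.0], [2.0, -1.0], [4.0, 0.0]])

# Global min-max scaling: one min and one max over all entries
scaled = (emb - emb.min()) / (emb.max() - emb.min())

print(scaled.min(), scaled.max())  # 0.0 1.0
```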
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 1200 samples in 0.000s...
[t-SNE] Computed neighbors for 1200 samples in 0.021s...
[t-SNE] Computed conditional probabilities for sample 1000 / 1200
[t-SNE] Computed conditional probabilities for sample 1200 / 1200
[t-SNE] Mean sigma: 2.847355
[t-SNE] KL divergence after 250 iterations with early exaggeration: 62.816189
[t-SNE] KL divergence after 300 iterations: 1.159818
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_09[:,0],tsne_results_09[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_09[:,0],tsne_results_09[:,1], c=y_valid_split[:1200], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)
image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_09):
dist = np.sum((position - image_positions) ** 2, axis=1)
if np.min(dist) > 0.02: # if far enough from other images
image_positions = np.r_[image_positions, [position]]
imagebox = mpl.offsetbox.AnnotationBbox(
        mpl.offsetbox.OffsetImage(x_valid_norm[index], cmap="binary"),  # validation image matching this embedding point
position, bboxprops={"lw": 1})
plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
Experiment 10¶
- CNN with 3 layers/max pooling layers
- 2 Fully-Connected Hidden Layers
- Dropout(0.3)
- L2 Regularization(0.001)
- Batch Normalization
Build CNN Model¶
k.clear_session()
model_10 = Sequential([
Conv2D(filters=128, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu,input_shape=x_train_norm.shape[1:]),
MaxPool2D((2, 2),strides=2),
Dropout(0.3),
Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
MaxPool2D((2, 2),strides=2),
Dropout(0.3),
Conv2D(filters=512, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
MaxPool2D((2, 2),strides=2),
Dropout(0.3),
Flatten(),
Dense(units=384,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(0.001)),
BatchNormalization(),
Dropout(0.3),
Dense(units=768,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(0.001)),
BatchNormalization(),
Dropout(0.3),
Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment10"] = {}
results["Experiment10"]["Architecture"] = "• CNN with 3 layers/max pooling layers\n • 2 Fully-Connected Hidden Layers\n • Dropout(0.3)\n • L2 Regularization(0.001)\n • Batch Normalization"
model_10.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D)                            (None, 30, 30, 128)   3584
max_pooling2d (MaxPooling2D)               (None, 15, 15, 128)   0
dropout (Dropout)                          (None, 15, 15, 128)   0
conv2d_1 (Conv2D)                          (None, 13, 13, 256)   295168
max_pooling2d_1 (MaxPooling2D)             (None, 6, 6, 256)     0
dropout_1 (Dropout)                        (None, 6, 6, 256)     0
conv2d_2 (Conv2D)                          (None, 4, 4, 512)     1180160
max_pooling2d_2 (MaxPooling2D)             (None, 2, 2, 512)     0
dropout_2 (Dropout)                        (None, 2, 2, 512)     0
flatten (Flatten)                          (None, 2048)          0
dense (Dense)                              (None, 384)           786816
batch_normalization (BatchNormalization)   (None, 384)           1536
dropout_3 (Dropout)                        (None, 384)           0
dense_1 (Dense)                            (None, 768)           295680
batch_normalization_1 (BatchNormalization) (None, 768)           3072
dropout_4 (Dropout)                        (None, 768)           0
dense_2 (Dense)                            (None, 10)            7690
=================================================================
Total params: 2573706 (9.82 MB)
Trainable params: 2571402 (9.81 MB)
Non-trainable params: 2304 (9.00 KB)
_________________________________________________________________
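The parameter counts in the summary above can be checked by hand: a Conv2D layer has kernel_h x kernel_w x in_channels x filters weights plus one bias per filter, a Dense layer has in_features x units weights plus one bias per unit, and BatchNormalization carries 4 parameters per feature (gamma and beta, plus the two non-trainable moving statistics). A quick arithmetic check:

```python
def conv2d_params(kh, kw, c_in, filters):
    # weights (kh * kw * c_in per filter) + one bias per filter
    return kh * kw * c_in * filters + filters

def dense_params(n_in, units):
    # weight matrix + one bias per unit
    return n_in * units + units

counts = [
    conv2d_params(3, 3, 3, 128),     # conv2d                -> 3584
    conv2d_params(3, 3, 128, 256),   # conv2d_1              -> 295168
    conv2d_params(3, 3, 256, 512),   # conv2d_2              -> 1180160
    dense_params(2 * 2 * 512, 384),  # dense (after Flatten) -> 786816
    4 * 384,                         # batch_normalization   -> 1536
    dense_params(384, 768),          # dense_1               -> 295680
    4 * 768,                         # batch_normalization_1 -> 3072
    dense_params(768, 10),           # dense_2               -> 7690
]
print(sum(counts))  # 2573706, matching "Total params" above
```

The non-trainable count (2304) is the two moving statistics per BatchNormalization feature: 2 * (384 + 768).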
keras.utils.plot_model(model_10, "CIFAR10_EXP_10.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
model_10.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
metrics=['accuracy'])
Model Train¶
# Start time
start_time = time.time()
history_10 = model_10.fit(x_train_norm
,y_train_split
,epochs=200
,batch_size=64
,verbose=1
,validation_data=(x_valid_norm, y_valid_split)
,callbacks=[
tf.keras.callbacks.ModelCheckpoint("A2_Exp_10_3CNN_2DNN_DO_L2_BN.h5",save_best_only=True,save_weights_only=False)
,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10),
]
)
# End time
end_time = time.time()
# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment10"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
2024-10-20 03:56:09.179894: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:961] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape insequential/dropout/dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
704/704 [==============================] - 6s 6ms/step - loss: 2.7633 - accuracy: 0.3147 - val_loss: 2.1523 - val_accuracy: 0.4300
Epoch 2/200
704/704 [==============================] - 4s 6ms/step - loss: 1.8389 - accuracy: 0.5035 - val_loss: 1.7617 - val_accuracy: 0.4958
...
Epoch 69/200
704/704 [==============================] - 4s 6ms/step - loss: 0.4910 - accuracy: 0.8816 - val_loss: 0.7415 - val_accuracy: 0.8084
Epoch 70/200
704/704 [==============================] - 4s 6ms/step - loss: 0.4900 - accuracy: 0.8805 - val_loss: 0.7718 - val_accuracy: 0.8044
Time taken to train Model: 284.72 seconds
train_loss = history_10.history['loss'][-1] # Training loss from the last epoch
train_accuracy = history_10.history['accuracy'][-1] # Training accuracy from the last epoch
val_loss = history_10.history['val_loss'][-1] # Validation loss from the last epoch
val_accuracy = history_10.history['val_accuracy'][-1] # Validation accuracy from the last epoch
# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")
model_10 = tf.keras.models.load_model("A2_Exp_10_3CNN_2DNN_DO_L2_BN.h5")
test_loss, test_accuracy = model_10.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")
results["Experiment10"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment10"]["Test Loss"] = round(test_loss,3)
results["Experiment10"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment10"]["Train Loss"] = round(train_loss,3)
results["Experiment10"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment10"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.490, Training Accuracy: 0.880
Validation Loss: 0.772, Validation Accuracy: 0.804
Test Loss: 0.741, Test Accuracy: 0.811
pred10 = model_10.predict(x_test_norm)
print('shape of preds: ', pred10.shape)
313/313 [==============================] - 0s 1ms/step
shape of preds:  (10000, 10)
history_10_dict = history_10.history
history_10_df=pd.DataFrame(history_10_dict)
history_10_df.tail().round(3)
| | loss | accuracy | val_loss | val_accuracy |
|---|---|---|---|---|
| 65 | 0.505 | 0.876 | 0.784 | 0.800 |
| 66 | 0.501 | 0.877 | 0.777 | 0.802 |
| 67 | 0.500 | 0.879 | 0.766 | 0.799 |
| 68 | 0.491 | 0.882 | 0.741 | 0.808 |
| 69 | 0.490 | 0.880 | 0.772 | 0.804 |
Plotting Performance Metrics¶
We use Matplotlib to create two stacked plots, displaying the training and validation accuracy (top) and loss (bottom) for each training epoch.
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_10.history['accuracy'], history_10.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_10.history['loss'], history_10.history['val_loss'], 'loss', 212)
Confusion matrices¶
Using sklearn.metrics, we print a per-class classification report and the overall accuracy. Then we visualize the confusion matrix and see what it tells us.
pred10_cm=np.argmax(pred10, axis=1)
print_validation_report(y_test, pred10_cm)
Classification Report
precision recall f1-score support
0 0.83 0.84 0.84 1000
1 0.92 0.90 0.91 1000
2 0.80 0.67 0.73 1000
3 0.68 0.65 0.66 1000
4 0.72 0.83 0.77 1000
5 0.76 0.73 0.75 1000
6 0.83 0.89 0.86 1000
7 0.85 0.82 0.83 1000
8 0.91 0.86 0.89 1000
9 0.83 0.91 0.86 1000
accuracy 0.81 10000
macro avg 0.81 0.81 0.81 10000
weighted avg 0.81 0.81 0.81 10000
Accuracy Score: 0.8111
Root Mean Square Error: 1.7584652399180372
plot_confusion_matrix(y_test,pred10_cm)
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred10[0:20], columns = ['airplane'
,'automobile'
,'bird'
,'cat'
,'deer'
,'dog'
,'frog'
,'horse'
,'ship'
,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
| | airplane | automobile | bird | cat | deer | dog | frog | horse | ship | truck |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.02% | 0.00% | 0.07% | 94.90% | 0.03% | 4.72% | 0.17% | 0.05% | 0.03% | 0.00% |
| 1 | 9.96% | 20.72% | 0.03% | 0.02% | 0.03% | 0.00% | 0.01% | 0.00% | 68.96% | 0.27% |
| 2 | 0.97% | 0.67% | 0.06% | 0.03% | 0.05% | 0.02% | 0.02% | 0.01% | 98.08% | 0.09% |
| 3 | 97.95% | 0.05% | 0.24% | 0.19% | 0.06% | 0.01% | 0.01% | 0.00% | 1.32% | 0.16% |
| 4 | 0.00% | 0.00% | 0.02% | 0.04% | 0.24% | 0.01% | 99.68% | 0.00% | 0.00% | 0.00% |
| 5 | 0.00% | 0.00% | 0.01% | 0.38% | 0.05% | 0.04% | 99.50% | 0.00% | 0.01% | 0.00% |
| 6 | 0.01% | 99.62% | 0.01% | 0.03% | 0.00% | 0.01% | 0.00% | 0.01% | 0.00% | 0.31% |
| 7 | 0.06% | 0.01% | 0.21% | 0.56% | 1.07% | 0.14% | 97.89% | 0.01% | 0.03% | 0.02% |
| 8 | 0.01% | 0.00% | 0.07% | 98.07% | 0.53% | 1.05% | 0.13% | 0.13% | 0.01% | 0.01% |
| 9 | 0.06% | 52.34% | 0.02% | 0.04% | 0.02% | 0.01% | 0.05% | 0.00% | 0.36% | 47.12% |
| 10 | 27.80% | 0.01% | 1.40% | 18.56% | 9.80% | 21.10% | 0.08% | 20.49% | 0.20% | 0.56% |
| 11 | 0.00% | 0.01% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 99.99% |
| 12 | 0.01% | 0.02% | 0.33% | 7.20% | 11.28% | 80.29% | 0.54% | 0.22% | 0.02% | 0.08% |
| 13 | 0.00% | 0.00% | 0.00% | 0.01% | 0.00% | 0.02% | 0.00% | 99.97% | 0.00% | 0.00% |
| 14 | 0.00% | 0.01% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 99.99% |
| 15 | 3.50% | 0.26% | 0.06% | 0.76% | 2.27% | 0.03% | 8.43% | 0.02% | 84.60% | 0.07% |
| 16 | 0.00% | 0.00% | 0.06% | 1.52% | 0.01% | 98.34% | 0.03% | 0.02% | 0.00% | 0.00% |
| 17 | 0.01% | 0.05% | 0.37% | 9.61% | 1.25% | 34.08% | 0.14% | 54.07% | 0.02% | 0.40% |
| 18 | 1.81% | 0.89% | 0.01% | 0.02% | 0.02% | 0.00% | 0.01% | 0.00% | 96.05% | 1.19% |
| 19 | 0.00% | 0.00% | 0.01% | 0.01% | 0.00% | 0.00% | 99.98% | 0.00% | 0.00% | 0.00% |
layer_names = []
for layer in model_10.layers:
layer_names.append(layer.name)
layer_names
['conv2d', 'max_pooling2d', 'dropout', 'conv2d_1', 'max_pooling2d_1', 'dropout_1', 'conv2d_2', 'max_pooling2d_2', 'dropout_2', 'flatten', 'dense', 'batch_normalization', 'dropout_3', 'dense_1', 'batch_normalization_1', 'dropout_4', 'dense_2']
# Extract the outputs of the first 14 layers (through the second Dense layer; the softmax output layer is not included):
layer_outputs_10 = [layer.output for layer in model_10.layers[:14]]
# Creates a model that will return these outputs, given the model input:
activation_model_10 = tf.keras.models.Model(inputs=model_10.input, outputs=layer_outputs_10)
# Get activations: [-4] is the first hidden Dense layer (used for t-SNE below); [-1] is the last extracted layer, the second hidden Dense layer
# activations_10 = activation_model_10.predict(x_valid_norm[:3250])
activations_10 = activation_model_10.predict(x_valid_norm[:1200])
dense_layer_activations_10 = activations_10[-4]
output_layer_activations_10 = activations_10[-1]
38/38 [==============================] - 0s 1ms/step
sklearn.manifold.TSNE¶
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_10 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_10 = tsne_10.fit_transform(dense_layer_activations_10)
# Scaling
tsne_results_10 = (tsne_results_10 - tsne_results_10.min()) / (tsne_results_10.max() - tsne_results_10.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 1200 samples in 0.000s...
[t-SNE] Computed neighbors for 1200 samples in 0.021s...
[t-SNE] Computed conditional probabilities for sample 1000 / 1200
[t-SNE] Computed conditional probabilities for sample 1200 / 1200
[t-SNE] Mean sigma: 6.276072
[t-SNE] KL divergence after 250 iterations with early exaggeration: 61.099758
[t-SNE] KL divergence after 300 iterations: 0.962577
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_10[:,0],tsne_results_10[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_10[:,0],tsne_results_10[:,1], c=y_valid_split[:1200], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)
image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_10):
dist = np.sum((position - image_positions) ** 2, axis=1)
if np.min(dist) > 0.02: # if far enough from other images
image_positions = np.r_[image_positions, [position]]
imagebox = mpl.offsetbox.AnnotationBbox(
        mpl.offsetbox.OffsetImage(x_valid_norm[index], cmap="binary"),  # validation image matching this embedding point
position, bboxprops={"lw": 1})
plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
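The thumbnail loop above uses a greedy spacing filter: an image is drawn only if its embedding position is at least √0.02 ≈ 0.14 (in scaled units) from every previously drawn image. The same filter, isolated as a small NumPy sketch with toy points:

```python
import numpy as np

def greedy_spread(positions, min_sq_dist=0.02):
    """Return indices of points kept by the greedy spacing filter."""
    placed = np.array([[1.0, 1.0]])  # same sentinel seed as the plotting code
    kept = []
    for i, pos in enumerate(positions):
        # keep the point only if it is far enough from all placed points
        if np.min(np.sum((pos - placed) ** 2, axis=1)) > min_sq_dist:
            placed = np.r_[placed, [pos]]
            kept.append(i)
    return kept

pts = np.array([[0.0, 0.0], [0.05, 0.05], [0.5, 0.5], [0.51, 0.5]])
print(greedy_spread(pts))  # -> [0, 2]
```

Points 1 and 3 are rejected because they sit within the squared-distance threshold of an already-placed neighbor, which is what keeps the plot from drowning in overlapping thumbnails.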
Experiment 11 - TWEAK Hyperparameters¶
- CNN with 3 layers/max pooling layers
- 2 Fully-Connected Hidden Layers (384, 768)
- Dropout(variable)
- L2 Regularization(variable)
- Batch Normalization
l2_rate = 0.001
dropout_rate = 0.5
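Since both rates are marked "(variable)", tweaking them amounts to sweeping a small grid and rebuilding/refitting the model once per combination. A minimal sketch of the grid itself (the candidate values here are illustrative; the model-building call is omitted):

```python
from itertools import product

# Candidate values to sweep (illustrative choices)
l2_rates = [1e-4, 1e-3, 1e-2]
dropout_rates = [0.3, 0.4, 0.5]

# One (l2_rate, dropout_rate) pair per training run
grid = list(product(l2_rates, dropout_rates))
print(len(grid))  # 9 combinations; this run uses (0.001, 0.5)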
Build CNN Model¶
k.clear_session()
model_11 = Sequential([
Conv2D(filters=128, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu,input_shape=x_train_norm.shape[1:]),
MaxPool2D((2, 2),strides=2),
Dropout(dropout_rate),
Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
MaxPool2D((2, 2),strides=2),
Dropout(dropout_rate),
Conv2D(filters=512, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
MaxPool2D((2, 2),strides=2),
Dropout(dropout_rate),
Flatten(),
Dense(units=384,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(l2_rate)),
BatchNormalization(),
Dropout(dropout_rate),
Dense(units=768,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(l2_rate)),
BatchNormalization(),
Dropout(dropout_rate),
Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment11"] = {}
results["Experiment11"]["Architecture"] = "• CNN with 3 layers/max pooling layers\n • 2 Fully-Connected Hidden Layers\n • Dropout(variable)\n • L2 Regularization(0.001)\n • Batch Normalization"
model_11.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 30, 30, 128) 3584
max_pooling2d (MaxPooling2D) (None, 15, 15, 128) 0
dropout (Dropout) (None, 15, 15, 128) 0
conv2d_1 (Conv2D) (None, 13, 13, 256) 295168
max_pooling2d_1 (MaxPooling2D) (None, 6, 6, 256) 0
dropout_1 (Dropout) (None, 6, 6, 256) 0
conv2d_2 (Conv2D) (None, 4, 4, 512) 1180160
max_pooling2d_2 (MaxPooling2D) (None, 2, 2, 512) 0
dropout_2 (Dropout) (None, 2, 2, 512) 0
flatten (Flatten) (None, 2048) 0
dense (Dense) (None, 384) 786816
batch_normalization (BatchNormalization) (None, 384) 1536
dropout_3 (Dropout) (None, 384) 0
dense_1 (Dense) (None, 768) 295680
batch_normalization_1 (BatchNormalization) (None, 768) 3072
dropout_4 (Dropout) (None, 768) 0
dense_2 (Dense) (None, 10) 7690
=================================================================
Total params: 2573706 (9.82 MB)
Trainable params: 2571402 (9.81 MB)
Non-trainable params: 2304 (9.00 KB)
_________________________________________________________________
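The output shapes in the summary follow from the standard "valid" convolution and pooling arithmetic: a 3×3 conv at stride 1 gives out = in − 3 + 1, and a 2×2 max pool at stride 2 gives out = ⌊(in − 2) / 2⌋ + 1. A quick cross-check, assuming the 32×32 CIFAR-10 input:

```python
def conv_out(size, kernel=3, stride=1):
    # "valid" padding: output = floor((size - kernel) / stride) + 1
    return (size - kernel) // stride + 1

def pool_out(size, pool=2, stride=2):
    return (size - pool) // stride + 1

size, trace = 32, []
for _ in range(3):               # three Conv2D + MaxPool2D stages
    size = conv_out(size); trace.append(size)
    size = pool_out(size); trace.append(size)

print(trace)                      # [30, 15, 13, 6, 4, 2] — matches the summary
print(size * size * 512)          # 2048 units feeding the Flatten layer
```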
keras.utils.plot_model(model_11, "CIFAR10_EXP_11_TWEAK.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
model_11.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
metrics=['accuracy'])
Model Train¶
# Start time
start_time = time.time()
history_11 = model_11.fit(x_train_norm
,y_train_split
,epochs=200
,batch_size=64
,verbose=1
,validation_data=(x_valid_norm, y_valid_split)
,callbacks=[
tf.keras.callbacks.ModelCheckpoint("A2_Exp_11_3CNN_2DNN_BN_TWEAK_L2001_DO05.h5",save_best_only=True,save_weights_only=False)
,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10),
]
)
# End time
end_time = time.time()
# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment11"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
2024-10-20 04:05:04.676861: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:961] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape insequential/dropout/dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
704/704 [==============================] - 6s 6ms/step - loss: 3.2646 - accuracy: 0.2181 - val_loss: 2.6560 - val_accuracy: 0.2834
Epoch 2/200
704/704 [==============================] - 4s 6ms/step - loss: 2.2898 - accuracy: 0.3897 - val_loss: 2.2724 - val_accuracy: 0.3714
Epoch 3/200
704/704 [==============================] - 4s 6ms/step - loss: 1.9079 - accuracy: 0.4693 - val_loss: 1.8766 - val_accuracy: 0.4700
Epoch 4/200
704/704 [==============================] - 4s 6ms/step - loss: 1.7136 - accuracy: 0.5144 - val_loss: 1.7722 - val_accuracy: 0.4950
Epoch 5/200
704/704 [==============================] - 4s 6ms/step - loss: 1.6062 - accuracy: 0.5435 - val_loss: 1.7477 - val_accuracy: 0.5028
Epoch 6/200
704/704 [==============================] - 4s 6ms/step - loss: 1.5422 - accuracy: 0.5663 - val_loss: 1.5872 - val_accuracy: 0.5442
Epoch 7/200
704/704 [==============================] - 4s 6ms/step - loss: 1.4910 - accuracy: 0.5862 - val_loss: 1.6825 - val_accuracy: 0.5234
Epoch 8/200
704/704 [==============================] - 4s 6ms/step - loss: 1.4598 - accuracy: 0.6013 - val_loss: 1.4008 - val_accuracy: 0.6234
Epoch 9/200
704/704 [==============================] - 4s 6ms/step - loss: 1.4173 - accuracy: 0.6152 - val_loss: 1.3281 - val_accuracy: 0.6388
Epoch 10/200
704/704 [==============================] - 4s 6ms/step - loss: 1.3988 - accuracy: 0.6280 - val_loss: 1.3159 - val_accuracy: 0.6340
Epoch 11/200
704/704 [==============================] - 4s 6ms/step - loss: 1.3685 - accuracy: 0.6347 - val_loss: 1.2403 - val_accuracy: 0.6754
Epoch 12/200
704/704 [==============================] - 4s 6ms/step - loss: 1.3441 - accuracy: 0.6427 - val_loss: 1.3900 - val_accuracy: 0.6382
Epoch 13/200
704/704 [==============================] - 4s 6ms/step - loss: 1.3207 - accuracy: 0.6522 - val_loss: 1.2797 - val_accuracy: 0.6646
Epoch 14/200
704/704 [==============================] - 4s 6ms/step - loss: 1.3113 - accuracy: 0.6552 - val_loss: 1.1753 - val_accuracy: 0.7018
Epoch 15/200
704/704 [==============================] - 4s 6ms/step - loss: 1.2907 - accuracy: 0.6640 - val_loss: 1.2927 - val_accuracy: 0.6528
Epoch 16/200
704/704 [==============================] - 4s 6ms/step - loss: 1.2777 - accuracy: 0.6681 - val_loss: 1.2825 - val_accuracy: 0.6596
Epoch 17/200
704/704 [==============================] - 4s 6ms/step - loss: 1.2562 - accuracy: 0.6750 - val_loss: 1.1284 - val_accuracy: 0.7154
Epoch 18/200
704/704 [==============================] - 4s 6ms/step - loss: 1.2396 - accuracy: 0.6776 - val_loss: 1.0919 - val_accuracy: 0.7296
Epoch 19/200
704/704 [==============================] - 4s 6ms/step - loss: 1.2302 - accuracy: 0.6842 - val_loss: 1.2078 - val_accuracy: 0.6842
Epoch 20/200
704/704 [==============================] - 4s 6ms/step - loss: 1.2179 - accuracy: 0.6877 - val_loss: 1.0980 - val_accuracy: 0.7294
Epoch 21/200
704/704 [==============================] - 4s 6ms/step - loss: 1.2085 - accuracy: 0.6880 - val_loss: 1.0410 - val_accuracy: 0.7468
Epoch 22/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1906 - accuracy: 0.6977 - val_loss: 1.0670 - val_accuracy: 0.7310
Epoch 23/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1903 - accuracy: 0.6974 - val_loss: 1.0773 - val_accuracy: 0.7350
Epoch 24/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1740 - accuracy: 0.7000 - val_loss: 1.0295 - val_accuracy: 0.7496
Epoch 25/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1714 - accuracy: 0.7047 - val_loss: 1.0207 - val_accuracy: 0.7538
Epoch 26/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1622 - accuracy: 0.7055 - val_loss: 1.1109 - val_accuracy: 0.7238
Epoch 27/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1484 - accuracy: 0.7102 - val_loss: 1.1120 - val_accuracy: 0.7192
Epoch 28/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1425 - accuracy: 0.7108 - val_loss: 1.0151 - val_accuracy: 0.7596
Epoch 29/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1246 - accuracy: 0.7175 - val_loss: 1.0509 - val_accuracy: 0.7462
Epoch 30/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1254 - accuracy: 0.7176 - val_loss: 0.9764 - val_accuracy: 0.7674
Epoch 31/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1289 - accuracy: 0.7170 - val_loss: 1.1292 - val_accuracy: 0.7086
Epoch 32/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1192 - accuracy: 0.7207 - val_loss: 1.0041 - val_accuracy: 0.7562
Epoch 33/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1044 - accuracy: 0.7224 - val_loss: 0.9532 - val_accuracy: 0.7770
Epoch 34/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1051 - accuracy: 0.7228 - val_loss: 0.9875 - val_accuracy: 0.7618
Epoch 35/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0855 - accuracy: 0.7286 - val_loss: 0.9614 - val_accuracy: 0.7684
Epoch 36/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0871 - accuracy: 0.7292 - val_loss: 1.0699 - val_accuracy: 0.7352
Epoch 37/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0866 - accuracy: 0.7287 - val_loss: 1.0312 - val_accuracy: 0.7486
Epoch 38/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0655 - accuracy: 0.7365 - val_loss: 0.9811 - val_accuracy: 0.7584
Epoch 39/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0672 - accuracy: 0.7348 - val_loss: 1.0472 - val_accuracy: 0.7358
Epoch 40/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0645 - accuracy: 0.7356 - val_loss: 0.9501 - val_accuracy: 0.7714
Epoch 41/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0560 - accuracy: 0.7378 - val_loss: 0.9597 - val_accuracy: 0.7682
Epoch 42/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0481 - accuracy: 0.7387 - val_loss: 0.9514 - val_accuracy: 0.7746
Epoch 43/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0461 - accuracy: 0.7412 - val_loss: 0.9558 - val_accuracy: 0.7712
Time taken to train Model: 173.08 seconds
train_loss = history_11.history['loss'][-1] # Training loss from the last epoch
train_accuracy = history_11.history['accuracy'][-1] # Training accuracy from the last epoch
val_loss = history_11.history['val_loss'][-1] # Validation loss from the last epoch
val_accuracy = history_11.history['val_accuracy'][-1] # Validation accuracy from the last epoch
# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")
model_11 = tf.keras.models.load_model("A2_Exp_11_3CNN_2DNN_BN_TWEAK_L2001_DO05.h5")
test_loss, test_accuracy = model_11.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")
results["Experiment11"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment11"]["Test Loss"] = round(test_loss,3)
results["Experiment11"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment11"]["Train Loss"] = round(train_loss,3)
results["Experiment11"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment11"]["Validation Loss"] = round(val_loss,3)
Training Loss: 1.046, Training Accuracy: 0.741
Validation Loss: 0.956, Validation Accuracy: 0.771
Test Loss: 0.944, Test Accuracy: 0.772
pred11 = model_11.predict(x_test_norm)
print('shape of preds: ', pred11.shape)
313/313 [==============================] - 0s 1ms/step
shape of preds: (10000, 10)
history_11_dict = history_11.history
history_11_df=pd.DataFrame(history_11_dict)
history_11_df.tail().round(3)
| epoch | loss | accuracy | val_loss | val_accuracy |
|---|---|---|---|---|
| 38 | 1.067 | 0.735 | 1.047 | 0.736 |
| 39 | 1.064 | 0.736 | 0.950 | 0.771 |
| 40 | 1.056 | 0.738 | 0.960 | 0.768 |
| 41 | 1.048 | 0.739 | 0.951 | 0.775 |
| 42 | 1.046 | 0.741 | 0.956 | 0.771 |
Plotting Performance Metrics¶
We use Matplotlib to create two plots side by side: one showing training and validation accuracy, the other training and validation loss, per training epoch.
plt.subplots(figsize=(16,12)) # l2_rate = 0.001, dropout_rate = 0.5
plt.tight_layout()
display_training_curves(history_11.history['accuracy'], history_11.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_11.history['loss'], history_11.history['val_loss'], 'loss', 212)
Confusion matrices¶
We generate a classification report and confusion matrix with sklearn.metrics, then visualize the confusion matrix and see what it tells us.
pred11_cm=np.argmax(pred11, axis=1)
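print_validation_report expects hard class labels, so the softmax probability rows are first collapsed with argmax along the class axis. The operation in isolation, on two toy probability rows:

```python
import numpy as np

# Two fake softmax rows over 3 classes
probs = np.array([[0.1, 0.7, 0.2],
                  [0.6, 0.3, 0.1]])

# argmax over axis=1 picks the most probable class per row
labels = np.argmax(probs, axis=1)
print(labels)  # -> [1 0]
```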
print_validation_report(y_test, pred11_cm)
Classification Report
precision recall f1-score support
0 0.85 0.75 0.80 1000
1 0.91 0.88 0.90 1000
2 0.75 0.60 0.67 1000
3 0.70 0.52 0.60 1000
4 0.63 0.82 0.72 1000
5 0.76 0.62 0.68 1000
6 0.68 0.91 0.78 1000
7 0.87 0.80 0.83 1000
8 0.77 0.94 0.84 1000
9 0.87 0.87 0.87 1000
accuracy 0.77 10000
macro avg 0.78 0.77 0.77 10000
weighted avg 0.78 0.77 0.77 10000
Accuracy Score: 0.7717
Root Mean Square Error: 1.9351485731075018
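Note that the RMSE above is computed directly on the integer class indices, which are nominal rather than ordinal for CIFAR-10, so its value mostly reflects how far apart the label IDs of confused classes happen to be. A sketch of that computation on toy labels, under the same assumption:

```python
import numpy as np

y_true = np.array([3, 8, 8, 0])   # toy true label IDs
y_pred = np.array([3, 8, 6, 1])   # toy predicted label IDs

# RMSE over label IDs: sqrt(mean squared difference)
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
print(round(rmse, 3))  # sqrt((0 + 0 + 4 + 1) / 4) ≈ 1.118
```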
plot_confusion_matrix(y_test,pred11_cm)
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred11[0:20], columns = ['airplane'
,'automobile'
,'bird'
,'cat'
,'deer'
,'dog'
,'frog'
,'horse'
,'ship'
,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
| | airplane | automobile | bird | cat | deer | dog | frog | horse | ship | truck |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.39% | 0.03% | 1.15% | 88.12% | 2.07% | 3.40% | 2.86% | 0.47% | 1.44% | 0.06% |
| 1 | 0.49% | 0.47% | 0.02% | 0.05% | 0.01% | 0.00% | 0.02% | 0.00% | 98.89% | 0.04% |
| 2 | 4.66% | 2.81% | 0.18% | 0.44% | 0.17% | 0.07% | 0.12% | 0.05% | 85.17% | 6.32% |
| 3 | 6.60% | 1.05% | 0.23% | 0.07% | 0.04% | 0.01% | 0.09% | 0.01% | 91.60% | 0.31% |
| 4 | 0.04% | 0.04% | 3.17% | 0.76% | 10.38% | 0.12% | 85.37% | 0.04% | 0.02% | 0.04% |
| 5 | 0.02% | 0.02% | 0.33% | 1.07% | 0.98% | 0.28% | 97.24% | 0.02% | 0.02% | 0.03% |
| 6 | 0.14% | 91.04% | 0.16% | 0.26% | 0.03% | 0.20% | 0.11% | 0.03% | 0.03% | 7.98% |
| 7 | 0.38% | 0.06% | 7.77% | 3.22% | 14.34% | 0.80% | 72.92% | 0.22% | 0.16% | 0.13% |
| 8 | 0.17% | 0.04% | 2.19% | 75.90% | 6.23% | 9.32% | 4.51% | 1.46% | 0.07% | 0.11% |
| 9 | 1.08% | 60.89% | 0.53% | 0.54% | 0.39% | 0.18% | 2.92% | 0.07% | 1.50% | 31.89% |
| 10 | 53.60% | 0.16% | 3.00% | 3.23% | 6.06% | 1.61% | 0.32% | 2.46% | 28.16% | 1.39% |
| 11 | 0.01% | 0.13% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.01% | 99.84% |
| 12 | 0.34% | 0.35% | 6.99% | 15.79% | 43.54% | 13.34% | 15.78% | 2.25% | 1.33% | 0.30% |
| 13 | 0.03% | 0.01% | 0.13% | 0.08% | 1.01% | 0.54% | 0.05% | 98.14% | 0.01% | 0.02% |
| 14 | 0.05% | 0.13% | 0.02% | 0.02% | 0.00% | 0.01% | 0.00% | 0.01% | 0.04% | 99.71% |
| 15 | 3.83% | 0.22% | 0.97% | 0.46% | 0.49% | 0.03% | 3.25% | 0.03% | 90.51% | 0.21% |
| 16 | 0.01% | 0.02% | 0.32% | 7.51% | 0.35% | 90.63% | 0.66% | 0.35% | 0.07% | 0.07% |
| 17 | 1.43% | 0.46% | 6.55% | 14.10% | 12.16% | 9.01% | 29.78% | 22.41% | 1.17% | 2.91% |
| 18 | 2.62% | 0.35% | 0.04% | 0.05% | 0.02% | 0.01% | 0.02% | 0.01% | 95.49% | 1.39% |
| 19 | 0.02% | 0.02% | 0.62% | 0.31% | 0.55% | 0.04% | 98.40% | 0.01% | 0.01% | 0.02% |
layer_names = []
for layer in model_11.layers:
layer_names.append(layer.name)
layer_names
['conv2d', 'max_pooling2d', 'dropout', 'conv2d_1', 'max_pooling2d_1', 'dropout_1', 'conv2d_2', 'max_pooling2d_2', 'dropout_2', 'flatten', 'dense', 'batch_normalization', 'dropout_3', 'dense_1', 'batch_normalization_1', 'dropout_4', 'dense_2']
# Extracts the outputs of the first 14 layers (up through the second dense layer):
layer_outputs_11 = [layer.output for layer in model_11.layers[:14]]
# Creates a model that will return these outputs, given the model input:
activation_model_11 = tf.keras.models.Model(inputs=model_11.input, outputs=layer_outputs_11)
# Get activations of the first dense layer (index -4) and of the last extracted layer (index -1)
# activations_11 = activation_model_11.predict(x_valid_norm[:3250])
activations_11 = activation_model_11.predict(x_valid_norm[:1200])
dense_layer_activations_11 = activations_11[-4]
output_layer_activations_11 = activations_11[-1]
38/38 [==============================] - 0s 1ms/step
activations_11[-3].shape
(1200, 384)
sklearn.manifold.TSNE¶
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_11 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_11 = tsne_11.fit_transform(dense_layer_activations_11)
# Scaling
tsne_results_11 = (tsne_results_11 - tsne_results_11.min()) / (tsne_results_11.max() - tsne_results_11.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 1200 samples in 0.001s...
[t-SNE] Computed neighbors for 1200 samples in 0.021s...
[t-SNE] Computed conditional probabilities for sample 1000 / 1200
[t-SNE] Computed conditional probabilities for sample 1200 / 1200
[t-SNE] Mean sigma: 7.921385
[t-SNE] KL divergence after 250 iterations with early exaggeration: 61.436398
[t-SNE] KL divergence after 300 iterations: 1.002880
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_11[:,0],tsne_results_11[:,1], c=y_valid[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_11[:,0],tsne_results_11[:,1], c=y_valid_split[:1200], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)
image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_11):
dist = np.sum((position - image_positions) ** 2, axis=1)
if np.min(dist) > 0.02: # if far enough from other images
image_positions = np.r_[image_positions, [position]]
imagebox = mpl.offsetbox.AnnotationBbox(
# Index into the same validation slice the t-SNE embedding was computed on
mpl.offsetbox.OffsetImage(x_valid_norm[index], cmap="binary"),
position, bboxprops={"lw": 1})
plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
Result 1: Create a table with the accuracy and loss for train/validation/test, plus the training time, for ALL the models.
# Convert the dictionary to a DataFrame
resultDf = pd.DataFrame(results).T
# Replace '\n' with '<br>' in the architecture details
resultDf['Architecture'] = resultDf['Architecture'].str.replace('\n', '<br>')
# Display the table with proper HTML rendering for line breaks
display(HTML(resultDf.to_html(escape=False)))
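The nested results dictionary (experiment name → metric → value) transposes cleanly into a per-experiment table. A minimal sketch with toy values standing in for the real entries:

```python
import pandas as pd

# Toy stand-in for the results dictionary built during the experiments
results = {
    "Experiment1": {"Test Accuracy": 0.474, "Train Time": "21.78 seconds"},
    "Experiment11": {"Test Accuracy": 0.772, "Train Time": "173.08 seconds"},
}

# DataFrame(results) puts experiments in columns; .T makes them rows
resultDf = pd.DataFrame(results).T
print(resultDf.shape)  # (2, 2)
```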
| | Architecture | Train Time | Test Accuracy | Test Loss | Train Accuracy | Train Loss | Validation Accuracy | Validation Loss |
|---|---|---|---|---|---|---|---|---|
| Experiment1 | • DNN with 2 layers • no regularization | 21.78 seconds | 0.474 | 1.478 | 0.53 | 1.311 | 0.461 | 1.55 |
| Experiment2 | • DNN with 3 layers • no regularization | 36.06 seconds | 0.474 | 1.466 | 0.762 | 0.655 | 0.454 | 2.307 |
| Experiment3 | • CNN with 2 layers/max pooling layers • 1 fully-connected layer • no regularization | 60.82 seconds | 0.713 | 0.865 | 0.97 | 0.085 | 0.685 | 2.025 |
| Experiment4 | • CNN with 3 layers/max pooling layers • 1 fully-connected layer • no regularization | 119.0 seconds | 0.736 | 0.794 | 0.985 | 0.048 | 0.723 | 2.322 |
| Experiment5 | • DNN with 2 layers (384, 768) • Batch Normalization • L2 Regularization(0.001) | 41.89 seconds | 0.49 | 1.482 | 0.65 | 0.991 | 0.477 | 1.637 |
| Experiment6 | • DNN with 3 layers • Regularization: batch normalization | 56.52 seconds | 0.483 | 1.48 | 0.792 | 0.581 | 0.478 | 2.413 |
| Experiment7 | • CNN with 2 layers/max pooling layers • L2 Regularization(0.001) | 145.38 seconds | 0.701 | 0.907 | 0.993 | 0.022 | 0.724 | 1.894 |
| Experiment8 | • CNN with 3 layers/max pooling layers • L2 Regularization(0.001) | 101.77 seconds | 0.699 | 0.944 | 0.984 | 0.047 | 0.701 | 2.117 |
| Experiment9 | • CNN with 3 layers/max pooling layers • Dropout(0.3) • L2 Regularization(0.001) • Batch Normalization | 185.3 seconds | 0.798 | 0.743 | 0.853 | 0.574 | 0.795 | 0.746 |
| Experiment10 | • CNN with 3 layers/max pooling layers • 2 Fully-Connected Hidden Layers • Dropout(0.3) • L2 Regularization(0.001) • Batch Normalization | 284.72 seconds | 0.811 | 0.741 | 0.88 | 0.49 | 0.804 | 0.772 |
| Experiment11 | • CNN with 3 layers/max pooling layers • 2 Fully-Connected Hidden Layers • Dropout(variable) • L2 Regularization(0.001) • Batch Normalization | 173.08 seconds | 0.772 | 0.944 | 0.741 | 1.046 | 0.771 | 0.956 |